Proposed changes for 0.1.0 #349

Merged
merged 5 commits into from
Jun 11, 2021

18 changes: 5 additions & 13 deletions .travis.yml
@@ -1,13 +1,10 @@
 language: php

 php:
-- 7.0
-- 7.1
-- 7.2
-- 7.3
-- 7.4
-# Should be changed to 8.0 once travis officially provides that label.
-- nightly
+- '7.2'
+- '7.3'
+- '7.4'
+- '8.0'

 env:
 - VALIDATION=false
@@ -30,12 +27,7 @@ cache:

 before_script:
 - if [[ $STATIC_ANALYSIS = true ]]; then composer require phpstan/phpstan --no-update; fi
-- |
-  if php -r 'exit(PHP_MAJOR_VERSION < 8 ? 0 : 1);';
-  then composer install
-  else
-  composer install --ignore-platform-reqs
-  fi
+- composer install
 - set -e # Stop on first error.
 - phpenv config-rm xdebug.ini || true
 - if find . -name "*.php" -path "./src/*" -path "./experiments/*" -path "./tools/*" -path "./syntax-visualizer/server/src/*" -exec php -l {} 2>&1 \; | grep "syntax error, unexpected"; then exit 1; fi
42 changes: 22 additions & 20 deletions README.md
@@ -2,13 +2,15 @@
 [![Build Status](https://travis-ci.org/Microsoft/tolerant-php-parser.svg?branch=master)](https://travis-ci.org/Microsoft/tolerant-php-parser)

 This is an early-stage PHP parser designed, from the beginning, for IDE usage scenarios (see [Design Goals](#design-goals) for more details). There is
-still a ton of work to be done, so at this point, this repo mostly serves as
+still a ton of work to be done, so at this point, this repo mostly serves as
 an experiment and the start of a conversation.

 ![image](https://cloud.githubusercontent.com/assets/762848/19023070/4ab01c92-889a-11e6-9bb5-ec1a6816aba2.png)

+This is the v0.1 branch, which changes data structures to support syntax added after the initial 0.0.x release line.
+
 ## Get Started
-After you've [configured your machine](docs/GettingStarted.md), you can use the parser to generate and work
+After you've [configured your machine](docs/GettingStarted.md), you can use the parser to generate and work
 with the Abstract Syntax Tree (AST) via a friendly API.
 ```php
 <?php
@@ -38,17 +40,17 @@ foreach ($astNode->getDescendantNodes() as $descendant) {
         // All Nodes link back to their parents, so it's easy to navigate the tree.
         $grandParent = $descendant->getParent()->getParent();
         var_dump($grandParent->getNodeKindName());

         // The AST is fully-representative, and round-trippable to the original source.
         // This enables consumers to build reliable formatting and refactoring tools.
         var_dump($grandParent->getLeadingCommentAndWhitespaceText());
     }

     // In addition to retrieving all children or descendants of a Node,
     // Nodes expose properties specific to the Node type.
     if ($descendant instanceof Node\Expression\EchoExpression) {
         $echoKeywordStartPosition = $descendant->echoKeyword->getStartPosition();
-        // To cut down on memory consumption, positions are represented as a single integer
+        // To cut down on memory consumption, positions are represented as a single integer
         // index into the document, but their line and character positions are easily retrieved.
         $lineCharacterPosition = PositionUtilities::getLineCharacterPositionFromPosition(
             $echoKeywordStartPosition,
@@ -59,15 +61,15 @@ foreach ($astNode->getDescendantNodes() as $descendant) {
 }
 ```

-> Note: [the API](docs/ApiDocumentation.md) is not yet finalized, so please file issues let us know what functionality you want exposed,
+> Note: [the API](docs/ApiDocumentation.md) is not yet finalized, so please file issues let us know what functionality you want exposed,
 and we'll see what we can do! Also please file any bugs with unexpected behavior in the parse tree. We're still
 in our early stages, and any feedback you have is much appreciated :smiley:.

 ## Design Goals
 * Error tolerant design - in IDE scenarios, code is, by definition, incomplete. In the case that invalid code is entered, the
-parser should still be able to recover and produce a valid + complete tree, as well as relevant diagnostics.
+parser should still be able to recover and produce a valid + complete tree, as well as relevant diagnostics.
 * Fast and lightweight (should be able to parse several MB of source code per second,
-to leave room for other features).
+to leave room for other features).
 * Memory-efficient data structures
 * Allow for incremental parsing in the future
 * Adheres to [PHP language spec](https://github.com/php/php-langspec),
@@ -83,34 +85,34 @@ so each language server operation should be < 50 ms to leave room for all the
 confusing, really fast, so readability and debug-ability is high priority.
 * Testable - the parser should produce provably valid parse trees. We achieve this by defining and continuously testing
 a set of invariants about the tree.
-* Friendly and descriptive API to make it easy for others to build on.
+* Friendly and descriptive API to make it easy for others to build on.
 * Written in PHP - make it as easy as possible for the PHP community to consume and contribute.

 ## Current Status and Approach
 To ensure a sufficient level of correctness at every step of the way, the
 parser is being developed using the following incremental approach:

-* [x] **Phase 1:** Write lexer that does not support PHP grammar, but supports EOF
+* [x] **Phase 1:** Write lexer that does not support PHP grammar, but supports EOF
 and Unknown tokens. Write tests for all invariants.
 * [x] **Phase 2:** Support PHP lexical grammar, lots of tests
-* [x] **Phase 3:** Write a parser that does not support PHP grammar, but produces tree of
+* [x] **Phase 3:** Write a parser that does not support PHP grammar, but produces tree of
 Error Nodes. Write tests for all invariants.
 * [x] **Phase 4:** Support PHP syntactic grammar, lots of tests
 * [ ] **Phase 5 (in progress :running:):** Real-world validation and optimization
 * [ ] _**Correctness:**_ validate that there are no errors produced on sample codebases, benchmark against other parsers (investigate any instance of disagreement), fuzz-testing
 * [ ] _**Performance:**_ profile, benchmark against large PHP applications
-* [ ] **Phase 6:** Finalize API to make it as easy as possible for people to consume.
+* [ ] **Phase 6:** Finalize API to make it as easy as possible for people to consume.

 ### Additional notes
 A few of the PHP grammatical constructs (namely yield-expression, and template strings)
 are not yet supported and there are also other miscellaneous bugs. However, because the parser is error-tolerant,
 these errors are handled gracefully, and the resulting tree is otherwise complete. To get a more holistic sense for
-where we are, you can run the "validation" test suite (see [Contributing Guidelines](Contributing.md) for more info
+where we are, you can run the "validation" test suite (see [Contributing Guidelines](Contributing.md) for more info
 on running tests). Or simply, take a look at the current [validation test results](https://travis-ci.org/Microsoft/tolerant-php-parser).

-Even though we haven't yet begun the performance optimization stage, we have seen promising results so far,
-and have plenty more room for improvement. See [How It Works](docs/HowItWorks.md) for details on our current
-approach, and run the [Performance Tests](Contributing.md#running-performance-tests) on your
+Even though we haven't yet begun the performance optimization stage, we have seen promising results so far,
+and have plenty more room for improvement. See [How It Works](docs/HowItWorks.md) for details on our current
+approach, and run the [Performance Tests](Contributing.md#running-performance-tests) on your
 own machine to see for yourself.

 ## Learn more
@@ -119,7 +121,7 @@ own machine to see for yourself.
 **:book: [Documentation](docs/GettingStarted.md#getting-started)** - learn how to reference the parser from your project, and how to perform
 operations on the AST to answer questions about your code.

-**:eyes: [Syntax Visualizer Tool](syntax-visualizer/client#php-parser-syntax-visualizer-tool)** - get a more tangible feel for the AST. Get creative - see if you can break it!
+**:eyes: [Syntax Visualizer Tool](syntax-visualizer/client#php-parser-syntax-visualizer-tool)** - get a more tangible feel for the AST. Get creative - see if you can break it!

 **:chart_with_upwards_trend: [Current Status and Approach](#current-status-and-approach)** - how much of the grammar is supported? Performance? Memory? API stability?

@@ -131,10 +133,10 @@ operations on the AST to answer questions about your code.
 * [Validation Strategy](docs/HowItWorks.md#validation-strategy)

 **:sparkling_heart: [Contribute!](Contributing.md)** - learn how to get involved, check out some pointers to educational commits that'll
-help you ramp up on the codebase (even if you've never worked on a parser before),
+help you ramp up on the codebase (even if you've never worked on a parser before),
 and recommended workflows that make it easier to iterate.

 ---
-This project has adopted the [Microsoft Open Source Code of Conduct](https://opensource.microsoft.com/codeofconduct/).
-For more information see the [Code of Conduct FAQ](https://opensource.microsoft.com/codeofconduct/faq/) or contact
+This project has adopted the [Microsoft Open Source Code of Conduct](https://opensource.microsoft.com/codeofconduct/).
+For more information see the [Code of Conduct FAQ](https://opensource.microsoft.com/codeofconduct/faq/) or contact
 [[email protected]](mailto:[email protected]) with any additional questions or comments.
4 changes: 2 additions & 2 deletions composer.json
@@ -3,10 +3,10 @@
     "description": "Tolerant PHP-to-AST parser designed for IDE usage scenarios",
     "type": "library",
     "require": {
-        "php": ">=7.0"
+        "php": ">=7.2"
     },
     "require-dev": {
-        "phpunit/phpunit": "^6.4|^7.5.20"
+        "phpunit/phpunit": "^8.5.15"
     },
     "license": "MIT",
     "authors": [
12 changes: 6 additions & 6 deletions docs/ApiDocumentation.md
@@ -10,15 +10,15 @@
 ```php
 public function getNodeKindName ( ) : string
 ```
-### Node::getStart
+### Node::getStartPosition
 Gets start position of Node, not including leading comments and whitespace.
 ```php
-public function getStart ( ) : int
+public function getStartPosition ( ) : int
 ```
-### Node::getFullStart
+### Node::getFullStartPosition
 Gets start position of Node, including leading comments and whitespace
 ```php
-public function getFullStart ( ) : int
+public function getFullStartPosition ( ) : int
 ```
 ### Node::getParent
 Gets parent of current node (returns null if has no parent)
@@ -198,11 +198,11 @@ public function getFullText ( string & $document ) : string
 ```php
 public function getStartPosition ( )
 ```
-### Token::getFullStart
+### Token::getFullStartPosition
 > TODO: add doc comment

 ```php
-public function getFullStart ( )
+public function getFullStartPosition ( )
 ```
 ### Token::getWidth
 > TODO: add doc comment
31 changes: 2 additions & 29 deletions src/FilePositionMap.php
@@ -34,46 +34,19 @@ public function __construct(string $file_contents) {
         $this->lineForCurrentOffset = 1;
     }

-    /**
-     * @param Node $node the node to get the start line for.
-     * TODO deprecate and merge this and getTokenStartLine into getStartLine
-     * if https://github.com/Microsoft/tolerant-php-parser/issues/166 is fixed,
-     * (i.e. if there is a consistent way to get the start offset)
-     */
-    public function getNodeStartLine(Node $node) : int {
-        return $this->getLineNumberForOffset($node->getStart());
-    }
-
-    /**
-     * @param Token $token the token to get the start line for.
-     */
-    public function getTokenStartLine(Token $token) : int {
-        return $this->getLineNumberForOffset($token->start);
-    }
-
     /**
      * @param Node|Token $node
      */
     public function getStartLine($node) : int {
-        if ($node instanceof Token) {
-            $offset = $node->start;
-        } else {
-            $offset = $node->getStart();
-        }
-        return $this->getLineNumberForOffset($offset);
+        return $this->getLineNumberForOffset($node->getStartPosition());
     }

     /**
      * @param Node|Token $node
      * Similar to getStartLine but includes the column
      */
     public function getStartLineCharacterPositionForOffset($node) : LineCharacterPosition {
-        if ($node instanceof Token) {
-            $offset = $node->start;
-        } else {
-            $offset = $node->getStart();
-        }
-        return $this->getLineCharacterPositionForOffset($offset);
+        return $this->getLineCharacterPositionForOffset($node->getStartPosition());
     }

     /** @param Node|Token $node */
3 changes: 3 additions & 0 deletions src/MissingToken.php
@@ -6,11 +6,14 @@

 namespace Microsoft\PhpParser;

+use ReturnTypeWillChange;
+
 class MissingToken extends Token {
     public function __construct(int $kind, int $fullStart) {
         parent::__construct($kind, $fullStart, $fullStart, 0);
     }

+    #[ReturnTypeWillChange]
     public function jsonSerialize() {
Contributor Author

https://wiki.php.net/rfc/internal_method_return_types adds support for tentative return types, and `: mixed` was later added to JsonSerializable's method in 8.1 in php/php-src#7051. This means that implementations will have to return something (or throw), and also that the absence of a return type will emit a deprecation warning.

  • fixes vendor/microsoft/tolerant-php-parser/src/Token.php:113 [8192] Declaration of Microsoft\PhpParser\Token::jsonSerialize() should be compatible with JsonSerializable::jsonSerialize(): mixed (a warning, not an error)

Because PHP 7 is still supported, add the `#[ReturnTypeWillChange]` attribute to suppress the warning instead.

Eventually this may have to drop support for 7.x because the `mixed` type requires 8.0. There is probably enough time before that becomes necessary, though that depends on how many 8.x minor releases there are, which is unknown.
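
A minimal sketch of the pattern, assuming a standalone class: `ExampleToken` and its fields are hypothetical and are not part of this repository. It declares `jsonSerialize()` without a return type so the file still parses on PHP 7.x, and relies on the attribute to silence the 8.1 deprecation. On PHP 7.x the `#[...]` line is read as an ordinary `#` comment.

```php
<?php
// Hypothetical class for illustration only; not part of tolerant-php-parser.
final class ExampleToken implements JsonSerializable {
    /** @var int */
    private $kind;
    /** @var int */
    private $start;

    public function __construct(int $kind, int $start) {
        $this->kind = $kind;
        $this->start = $start;
    }

    // No return type is declared, so this stays valid on PHP 7.x.
    // On PHP 8.1+ the attribute suppresses the deprecation about the
    // missing `: mixed` return type; PHP 7.x treats the line as a comment.
    #[\ReturnTypeWillChange]
    public function jsonSerialize() {
        return ['kind' => $this->kind, 'start' => $this->start];
    }
}

echo json_encode(new ExampleToken(3, 42)), "\n"; // prints {"kind":3,"start":42}
```

The sketch uses the fully qualified `\ReturnTypeWillChange` because it has no namespace; inside a namespace, a `use ReturnTypeWillChange;` import as in this diff serves the same purpose.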

         return array_merge(
             ["error" => $this->getTokenKindNameFromValue(TokenKind::MissingToken)],