Better error message when ; is mistakenly in the middle of class definition #5

Open
opened 2025-12-22 23:25:26 +00:00 by michaliskambi · 2 comments
michaliskambi commented 2025-12-22 23:25:26 +00:00 (Migrated from github.com)

Mistakenly placing `;` in the middle of a class definition, like this:

```
let hudObject = {
  bbb: {
    x: 1
  },

  updateStep: function(deltaTime) {
  };

  aaa: function() {
  }
};

puts("HUD object loaded");
```

produces this error:

```
[Exception(EPOCASyntaxError["poca/hud.poca"):8,2]: Bad hash initializer
```

Can we improve the message to make it clearer that the syntax is incorrect and that `;` is unexpected here?

Looking at what JS does in this situation:

1. Firefox JS engine answers

   ```
   Uncaught SyntaxError: missing } after property list
   note: { opened at line 1, column 17
   ```

2. https://runjs.app/play answers even better, pointing to the right solution:

   ```
   SyntaxError: Unexpected token, expected "," (7:3)
   ```
BeRo1985 commented 2025-12-23 00:40:54 +00:00 (Migrated from github.com)

You're right that the error messages should be clearer in this case, like "Unexpected ;" or "expected ,", similar to what modern JS engines provide. Unfortunately, this isn't as easy to implement in POCA as one might think. But I do want to try to explain why it currently works the way it does.

When you write code in POCA, it doesn't just get read from left to right and parsed in one go, as you might expect from traditional parsers like recursive descent or LL(k) parsers. Instead, it goes through several stages, each one transforming the code in different ways:

As a first step, the Lexer reads your source code character by character and turns it into a stream of tokens (symbols that represent keywords, operators, identifiers, etc.).

As a second step comes the Transformer. The Transformer takes that token stream and rewrites parts of it. For example, if you write an arrow function like `x => x * 2`, the Transformer converts that into a regular function definition. It does the same for import statements, the super keyword, and various other syntactic conveniences. By the time the Transformer is done, your code has already been reorganized quite a bit.

As a third step, POCA uses a technique called precedence climbing to parse expressions correctly. It takes the full transformed token stream and builds a reorganized token tree, but not yet a full syntax tree. This isn't a simple linear process: the precedence climbing algorithm uses 33 different precedence levels for operators. This allows it to interpret complex expressions correctly, but it also makes the parsing process quite intricate. This step is part of the cause of the confusing error messages, because some error messages are already generated here, before the full syntax tree is built and before the real parser has a chance to analyze the code in full context.
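
For readers unfamiliar with precedence climbing, here is a minimal sketch of the general technique in Python. It is not POCA's implementation: the two-level operator table, the `tokenize` and `parse` helpers and the error wording are all assumptions made for illustration. It does show how an error can surface when the climb ends and tokens are still left over.

```python
import re

# Hypothetical operator table for illustration only (two levels, not POCA's 33).
PRECEDENCE = {"+": 1, "-": 1, "*": 2, "/": 2}


def tokenize(src):
    # Numbers, identifiers, or any other single non-space character.
    return re.findall(r"\d+|\w+|\S", src)


def parse_atom(tokens, pos):
    if pos >= len(tokens):
        raise SyntaxError("unexpected end of input")
    tok = tokens[pos]
    if tok == "(":
        tree, pos = parse_expression(tokens, pos + 1)
        if pos >= len(tokens) or tokens[pos] != ")":
            raise SyntaxError("expected ')'")
        return tree, pos + 1
    if re.fullmatch(r"\w+", tok):
        return tok, pos + 1
    raise SyntaxError(f"unexpected token {tok!r}")


def parse_expression(tokens, pos=0, min_prec=1):
    """Return (tree, next_pos); only consume operators binding at least min_prec."""
    left, pos = parse_atom(tokens, pos)
    while pos < len(tokens) and PRECEDENCE.get(tokens[pos], 0) >= min_prec:
        op = tokens[pos]
        # Left-associative: the right operand must bind tighter than op itself.
        right, pos = parse_expression(tokens, pos + 1, PRECEDENCE[op] + 1)
        left = (op, left, right)
    return left, pos


def parse(src):
    tokens = tokenize(src)
    tree, pos = parse_expression(tokens)
    if pos != len(tokens):
        # Leftover tokens: no precedence level knows what to do with them.
        raise SyntaxError(f"unexpected token {tokens[pos]!r}")
    return tree
```

With this sketch, `parse("1 + 2 * 3")` returns `("+", "1", ("*", "2", "3"))`, while `parse("1 + 2 ; 3")` raises a `SyntaxError` naming the stray `;`.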

As a fourth and final step, the parser takes that reorganized token tree, builds a full syntax tree, and generates the final bytecode that will be executed.

So, the "Precedence level overflow" error message comes at the very end of the precedence climbing step, before the actual parser runs, when the algorithm has exhausted all precedence levels and doesn't know what to do with the remaining tokens. It's essentially an emergency brake: "Something doesn't fit together here."

The "Bad hash initializer" in your example happens because the ; inside the object makes POCA think a new statement is starting, but we're still in the middle of an object literal.

The problem: By the time the error is detected, the code has already gone through so many transformations that the original source position is partially lost. That's why the line number sometimes points to the beginning of the statement rather than the exact error location.
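
One common way to keep positions through a rewriting stage, sketched here in Python, is to attach line and column to every token and let synthesized tokens inherit the position of the token that triggered the rewrite. This is only an illustration under assumptions: the `Token` class and `rewrite_arrow` function are hypothetical, and the rewrite is a simplified stand-in for the arrow-function transformation described above, not POCA's actual Transformer.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Token:
    text: str
    line: int
    col: int


def rewrite_arrow(tokens):
    """Rewrite 'param => body' into 'function ( param ) { return body }'.

    Toy rules: a single identifier parameter, and the body runs up to the
    next ';' or the end of the token list.
    """
    for k, tok in enumerate(tokens):
        if tok.text == "=>" and k > 0:
            arrow, param = tok, tokens[k - 1]
            end = k + 1
            while end < len(tokens) and tokens[end].text != ";":
                end += 1

            def synth(text):
                # Synthesized tokens point back at the '=>' that caused them.
                return Token(text, arrow.line, arrow.col)

            return (
                tokens[: k - 1]
                + [synth("function"), synth("("), param, synth(")")]
                + [synth("{"), synth("return")]
                + tokens[k + 1 : end]
                + [synth("}")]
                + tokens[end:]
            )
    return list(tokens)


# 'let f = x => x * 2;' as it might appear on line 3 of a source file.
source_tokens = [
    Token("let", 3, 1), Token("f", 3, 5), Token("=", 3, 7),
    Token("x", 3, 9), Token("=>", 3, 11), Token("x", 3, 14),
    Token("*", 3, 16), Token("2", 3, 18), Token(";", 3, 19),
]
out = rewrite_arrow(source_tokens)
```

After the rewrite, original tokens such as `2` still carry their own column, and the inserted `function`, `{`, `return` and `}` tokens all report the position of `=>`, so a later stage can still point at a place the user actually wrote.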

Better error messages would be possible, but would require significant refactoring of the transformer/parser to preserve context information better. It's already on the wishlist, but not a small undertaking.

The fact that almost everything in POCA is an expression, including statements, is also indirectly related to this. In POCA you can write things like

```js
let x = if(cond) { a } else { b };
```

or

```js
let y = { let a = 0; while(a < 10) { a = a + 1; } a; };
```

and similar constructs, which are not possible in traditional languages with statement/expression separation.

michaliskambi commented 2025-12-23 02:17:05 +00:00 (Migrated from github.com)

Thanks for the answers. I understand it's not an easy task to improve the error messages :) From the point of view of the user, even changing the existing error messages (without changing the logic to generate them) would be some improvement.

  • "Precedence level overflow" -> the explanation you give "Something doesn't fit together here" is actually already better :) In the end, users don't know the exact transformations and 33 precedence levels the code goes through :)

  • "Bad hash initializer" -> the explanation you give "a new statement is starting, but we're still in the middle of an object literal" is already better. Though I understand the error may also occur in other contexts. Is it possible to formulate it in a way that is broader, accounts for other contexts where it can occur, but is still more helpful? :)

Reference
BeRo1985/poca#5