Unicode case-insensitiveness #68

Open
opened 2021-05-08 14:09:29 +00:00 by benibela · 1 comment
benibela commented 2021-05-08 14:09:29 +00:00 (Migrated from github.com)

Unicode defines lower/upper case mappings for certain symbols beyond the ASCII letters.

For example, these should all find a match (the Unicode Kelvin sign, U+212A, decimal 8490):


  f := TFLRE.Create('k', [rfIGNORECASE]);
  writeln(f.Find('K'));
  f := TFLRE.Create('K', [rfIGNORECASE]);
  writeln(f.Find('K'));
  f := TFLRE.Create('[a-z]', [rfIGNORECASE]);
  writeln(f.Find('K'));
  f := TFLRE.Create('K', [rfIGNORECASE]);
  writeln(f.Find('k'));
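
For reference, here is a minimal sketch of why the Kelvin sign is expected to match `k`, `K` and `[a-z]` under case-insensitive matching: in Unicode case folding, U+212A maps to U+006B (`k`). `SimpleFold` is a hypothetical helper written only for this illustration. It is not part of FLRE and covers just the ASCII letters plus the Kelvin sign.

```pascal
program KelvinFold;
{$mode objfpc}{$H+}

// Hypothetical helper for illustration only; it is not part of FLRE and
// covers just the ASCII letters plus the Kelvin sign.
function SimpleFold(CodePoint: LongWord): LongWord;
begin
  case CodePoint of
    $0041..$005A: Result := CodePoint + $20; // 'A'..'Z' -> 'a'..'z'
    $212A: Result := $006B;                  // KELVIN SIGN -> 'k'
  else
    Result := CodePoint;
  end;
end;

begin
  // The Kelvin sign, 'k' and 'K' all fold to the same value, so a
  // case-insensitive comparison built on this folding treats them as equal.
  writeln(SimpleFold($212A) = SimpleFold(Ord('k')));
  writeln(SimpleFold($212A) = SimpleFold(Ord('K')));
  // '[a-z]' under case folding: the folded Kelvin sign lies inside the range.
  writeln((SimpleFold($212A) >= Ord('a')) and (SimpleFold($212A) <= Ord('z')));
end.
```

A real implementation would use the full Unicode case folding data rather than a hand-written `case` statement. The sketch only shows the single mapping this report depends on.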

benibela commented 2021-05-08 15:35:56 +00:00 (Migrated from github.com)

I forgot the [rfUTF8] flag.

But the first three still fail with it:

 f := TFLRE.Create('k', [rfIGNORECASE,rfUTF8]);
 writeln(f.Find('K'));
 f := TFLRE.Create('K', [rfIGNORECASE,rfUTF8]);
 writeln(f.Find('K'));
 f := TFLRE.Create('[a-z]', [rfIGNORECASE,rfUTF8]);
 writeln(f.Find('K'));
 f := TFLRE.Create('K', [rfIGNORECASE,rfUTF8]);
 writeln(f.Find('k'));

Also, rfIGNORECASE is a bad name, since it collides with sysutils.rfIGNORECASE.
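
Until the name changes, one possible workaround is to qualify the flag with its unit name. When two units export the same identifier, Pascal resolves an unqualified use to the unit listed last in the `uses` clause, so an unqualified `rfIGNORECASE` can silently pick the SysUtils constant. The sketch below assumes the library's unit is called `FLRE` and is on the compiler's search path. It is an illustration, not a tested reproduction.

```pascal
program QualifiedFlag;
{$mode objfpc}{$H+}

uses
  FLRE, SysUtils; // SysUtils listed last, so an unqualified rfIGNORECASE
                  // would resolve to SysUtils.rfIgnoreCase here

var
  f: TFLRE;
begin
  // Qualifying with the unit name (assumed to be FLRE) selects the regex
  // flag regardless of the order of units in the uses clause.
  f := TFLRE.Create('k', [FLRE.rfIGNORECASE, FLRE.rfUTF8]);
  try
    writeln(f.Find('k'));
  finally
    f.Free;
  end;
end.
```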

Perhaps UTF8Find needs to be used?

 f := TFLRE.Create('k', [rfIGNORECASE,rfUTF8]);
 writeln(f.UTF8Find('K'));
 f := TFLRE.Create('K', [rfIGNORECASE,rfUTF8]);
 writeln(f.UTF8Find('K'));
 f := TFLRE.Create('[a-z]', [rfIGNORECASE,rfUTF8]);
 writeln(f.UTF8Find('K'));
 f := TFLRE.Create('K', [rfIGNORECASE,rfUTF8]);
 writeln(f.UTF8Find('k'));

But that gives the same output.
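
One thing that may be worth ruling out is the source file encoding, since the compiler could store a non-ASCII string literal differently from what the editor shows. A minimal sketch that builds the Kelvin sign from its code point, so the test input is known to be the UTF-8 byte sequence for U+212A. `EncodeUtf8ThreeBytes` is a hypothetical helper, not an FLRE function.

```pascal
program KelvinUtf8;
{$mode objfpc}{$H+}

// Hypothetical helper, not taken from FLRE: encodes a code point in the
// range U+0800..U+FFFF as three UTF-8 bytes.
function EncodeUtf8ThreeBytes(CodePoint: LongWord): AnsiString;
begin
  SetLength(Result, 3);
  Result[1] := AnsiChar(Byte($E0 or (CodePoint shr 12)));
  Result[2] := AnsiChar(Byte($80 or ((CodePoint shr 6) and $3F)));
  Result[3] := AnsiChar(Byte($80 or (CodePoint and $3F)));
end;

var
  Kelvin: AnsiString;
  i: Integer;
begin
  Kelvin := EncodeUtf8ThreeBytes($212A);
  // Print the bytes in hex; U+212A encodes as E2 84 AA.
  for i := 1 to Length(Kelvin) do
    write(HexStr(Ord(Kelvin[i]), 2), ' ');
  writeln;
end.
```

Passing `Kelvin` (or the equivalent literal `#$E2#$84#$AA`) to `Find` or `UTF8Find` instead of a typed character would take the source encoding out of the picture.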

Reference
BeRo1985/flre#68