feat: complete after error syntax #334

liuxy0551 · 2024-07-25T10:29:00Z

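Autocomplete after SQL that contains a syntax error

Current behavior

A syntax error in the preceding SQL prevents keywords such as INSERT from being suggested accurately at the cursor position: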

SELECT FROM tb1;
I|

The SELECT * FROM that follows the invalid statement is parsed as multiple statements, so accurate autocompletion is not possible:

SELECT FROM tb1;
SELECT * FROM |

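Expected behavior

When statements are separated by a semicolon, the position right after the semicolon becomes the left boundary, the right boundary stays unchanged, and the content within that range is passed to antlr4-c3 for parsing. Keywords such as INSERT are expected to be suggested in this case: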

SELECT  FROM tb1;
I|

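Without a semicolon separator, there is no way to tell that the SQL statement on the first line has ended, so accurate autocompletion is not possible in this case: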

SELECT FROM tb1
I|

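Approach

For the left and right boundaries mentioned above, see the description in dt-sql-parser #231.

The input is split on the delimiter (usually ;). The existing strategy of finding the smallest suitable range is kept and refined: the parsing range is narrowed further in two ways.

Within the suitable range already obtained, start at the cursor and search leftward for the tokenIndex of a ; and use it as the left boundary;
Within the suitable range already obtained, start at the cursor and search rightward for the tokenIndex of a ; and use it as the right boundary;

When writing SQL, people rarely type the current statement's ; first, so the right boundary usually does not change again. If no ; is found on a given side, that boundary is left unmodified.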
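The two boundary searches described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the `Tok` and `TokenRange` shapes, the whitespace-based `tokenize` helper, and the `narrowBySemicolon` name are all hypothetical. The sketch also assumes that the left boundary lands on the token right after the `;` and that the right boundary lands on the `;` itself.

```typescript
// Hypothetical minimal token shape; the real parser works on ANTLR tokens.
interface Tok {
    text: string;
    tokenIndex: number;
}

// Inclusive token-index range that will be handed to the parser.
interface TokenRange {
    start: number;
    stop: number;
}

// Toy tokenizer for the demo: `;` is its own token, everything else splits on whitespace.
function tokenize(sql: string): Tok[] {
    const parts: string[] = sql.match(/;|[^\s;]+/g) ?? [];
    return parts.map((text, tokenIndex) => ({ text, tokenIndex }));
}

// Narrow an already-computed range around the caret using `;` tokens.
// A boundary is left unchanged when no `;` is found on that side.
function narrowBySemicolon(tokens: Tok[], range: TokenRange, caretTokenIndex: number): TokenRange {
    let start = range.start;
    let stop = range.stop;

    // Search leftward from the token before the caret (assumption: start right after the `;`).
    for (let i = caretTokenIndex - 1; i >= range.start; i--) {
        if (tokens[i]?.text === ';') {
            start = i + 1;
            break;
        }
    }

    // Search rightward from the caret (assumption: the `;` itself closes the range).
    for (let i = caretTokenIndex; i <= range.stop; i++) {
        if (tokens[i]?.text === ';') {
            stop = i;
            break;
        }
    }

    return { start, stop };
}

const demoTokens = tokenize('SELECT FROM tb1; I');
const caret = demoTokens.length - 1; // the caret sits on the last token, "I"
const narrowed = narrowBySemicolon(demoTokens, { start: 0, stop: caret }, caret);
console.log(narrowed); // start and stop are both 4, so only "I" would be parsed
```

With `SELECT FROM tb1 I` (no semicolon) the same call returns the range unchanged, which matches the case described earlier where accurate completion is not possible.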

Result

liuxy0551 · 2024-07-31T09:01:56Z

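Problems encountered along the way

I tried splitting on keywords that start a standalone statement (e.g. SELECT, INSERT) and ran into some problems:

Certain PostgreSQL syntax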

REVOKE SELECT (co_name) ON table_name |FROM PUBLIC;

GRANT SELECT (column_name) ON table_name TO |role_specification;

MERGE INTO wines w USING wine_stock_changes s ON s.winename = w.winename 
WHEN NOT MATCHED AND stock_delta > 0 
THEN INSERT (col_name) |VALUES(s.winename, s.stock_delta);

WITH with_query_name (col_name) AS (SELECT id FROM table_expression) SEARCH DEPTH 
FIRST BY column_name SET column_name 
CYCLE col_name SET col_name 
USING col_name SELECT|;

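In the statements above, keywords such as SELECT (e.g. in SELECT ... ON) do not start a standalone statement as they normally would; they are part of a larger statement. Fragments obtained by splitting on them therefore still cannot be autocompleted correctly.

Subqueries (complex)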

SELECT c.customer_id, c.customer_name, c.email, total_orders.total_amount, total_orders.order_count
FROM customers c
JOIN (
  SELECT o.customer_id, SUM(o.total_amount) AS total_amount, COUNT(o.order_id) AS order_count
  FROM orders o
  WHERE o.order_date BETWEEN '2024-08-01' AND '2024-08-31'
  GROUP BY o.customer_id
  HAVING COUNT(o.order_id) > 5
) AS total_orders
ON c.customer_id = total_orders.customer_id
WHERE| c.status = 'active'
ORDER BY total_orders.total_amount DESC;

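The statement above contains nested subqueries. If the cursor appears after a subquery and is not at the same nesting level as that subquery, the split goes visibly wrong. The result is shown below: even the subquery's parentheses are incomplete, let alone correct autocompletion.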

  SELECT o.customer_id, SUM(o.total_amount) AS total_amount, COUNT(o.order_id) AS order_count
  FROM orders o
  WHERE o.order_date BETWEEN '2024-08-01' AND '2024-08-31'
  GROUP BY o.customer_id
  HAVING COUNT(o.order_id) > 5
) AS total_orders
ON c.customer_id = total_orders.customer_id
WHERE|

Therefore, splitting on statement-leading keywords was abandoned; the input is split only on the delimiter (usually ;).
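The failure mode can be reproduced with a small sketch. It is only an illustration of the abandoned idea, not code from this PR: the `LEADING_KEYWORDS` list and the `naiveKeywordSlice` and `parenBalance` helpers are hypothetical, and the parenthesis check ignores string literals and comments for simplicity.

```typescript
// Hypothetical list of statement-leading keywords, for illustration only.
const LEADING_KEYWORDS: string[] = ['SELECT', 'INSERT', 'UPDATE', 'DELETE'];

// Naive split: cut the text before the caret at the last statement-leading keyword.
function naiveKeywordSlice(sql: string, caretOffset: number): string {
    const before = sql.slice(0, caretOffset);
    const pattern = new RegExp('\\b(' + LEADING_KEYWORDS.join('|') + ')\\b', 'gi');
    let lastStart = 0;
    let m: RegExpExecArray | null;
    while ((m = pattern.exec(before)) !== null) {
        lastStart = m.index; // remember the most recent keyword position
    }
    return before.slice(lastStart);
}

// Net parenthesis depth of a fragment; any non-zero value means it is unbalanced.
function parenBalance(text: string): number {
    let depth = 0;
    for (let i = 0; i < text.length; i++) {
        if (text[i] === '(') depth++;
        else if (text[i] === ')') depth--;
    }
    return depth;
}

// Caret after WHERE, one nesting level above the subquery.
const joinSql =
    'SELECT c.id FROM customers c JOIN (SELECT o.customer_id FROM orders o) AS t ' +
    'ON c.id = t.customer_id WHERE ';
const fragment = naiveKeywordSlice(joinSql, joinSql.length);
console.log(fragment);               // begins at the inner SELECT, not the outer one
console.log(parenBalance(fragment)); // -1: a closing parenthesis without its opener

// Caret before FROM in a REVOKE statement: SELECT is a privilege name here.
const revokeSql = 'REVOKE SELECT (co_name) ON table_name FROM PUBLIC;';
console.log(naiveKeywordSlice(revokeSql, revokeSql.indexOf('FROM')));
```

In both cases the fragment is not a statement that a parser could complete, which is consistent with the decision to split only on the delimiter.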

liuxy0551 · 2024-09-29T09:24:41Z

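Beta packages have been published and verified in the offline development product; the behavior matches expectations. Versions: dt-sql-parser@4.1.0-beta.2, monaco-sql-languages@0.12.3-beta.3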

src/parser/flink/index.ts

src/parser/common/basicSQL.ts

openai0229 · 2024-12-06T07:59:21Z

So both the left and right boundaries are determined by semicolons?

mumiao · 2025-03-28T02:25:15Z

There are conflicts.

liuxy0551 · 2025-03-28T02:49:02Z

> There are conflicts.

This partially overlaps with the design of #378; the feature needs to be re-verified.

liuxy0551 · 2025-04-03T03:36:54Z

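> There are conflicts.
>
> This partially overlaps with the design of #378; the feature needs to be re-verified.

@JackWang032 I split the body of getMinimumParserInfo into getMinimumInputInfo and parserWithNewInput. The former obtains the minimal parsing boundary and the latter re-parses the new inputSlice. This makes it easier to split the input on ; again afterwards and parse the result.

The specific changes are in bd83147.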

JackWang032 · 2025-04-03T06:37:54Z

src/parser/common/basicSQL.ts

+     * @param input source string
+     * @returns parse and parserTree
+     */
+    private parserWithNewInput(inputSlice: string) {


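JackWang032 (Apr 3, 2025): This method is the counterpart of createParserWithCache, so I think the method name could be changed. Also, wouldn't it be better to return only parserIns and let each calling method decide when to generate the parse tree?

liuxy0551 (Apr 7, 2025): Do you have any suggestions for the method name? This method is more of a combination of createParserWithCache and parseWithCache, which is why it returns both parserIns and parserTree. Other options I can think of are parserWithInput and parserWithInputSlice.

JackWang032 (Apr 7, 2025): I lean toward using the existing createParser directly; the two overlap in functionality.

liuxy0551 (Apr 7, 2025): I pushed a new commit that simply uses the existing createParser method to get parserIns.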

JackWang032 · 2025-04-07T07:10:46Z

+1

* feat: improve errorListener msg (#281)
* feat: add mysql errorListener and commonErrorListener
* feat: improve other sql error msg
* feat: support i18n for error msg
* feat: add all sql errorMsg unit test
* feat: update locale file and change i18n funtion name
* test: upate error unit test
* feat(flinksql): collect comment, type attribute for entity (#319)
* feat(flinksql): collect comment, type attribute for entity
* feat(flinksql): delete console log
* fix(#305): delete function ctxToWord,using ctxToText instead of ctxToWord
* feat: update attribute's type
* feat(flinksql): update flinksql's entitycollect unit test
* feat: optimize interface and update unit test
* feat: update collect attr detail
* feat: optimize interface and some function's arguments
* feat: add comment and update params' name
* feat: collect alias in select statement
* feat: update collect attribute function and update unit test
---------
Co-authored-by: zhaoge <>
* fix: spell check (#337)
Co-authored-by: liuyi <liuyi@dtstack.com>
* ci: check-types and test unit update
* feat: collect entity's attribute(#333)
* feat(trinosql): collect trino sql's attribute(comment,alias,colType)
* feat(hivesql): collect hive sql's attribute(comment,alias,colType)
* feat(impalasql): collect attribute(comment, colType, alias)
* feat(sparksql): collect entity's attribute (comment,alias, colType)
* feat: update endContextList of collect attribute
* feat(postgresql): collect hive sql's attribute(alias,colType)
* feat: update interface of attrInfo and alter entitycollect ts file
* feat(mysql): collect entity's attribute(comment,colType,alias)
* ci: fix check-types problem
---------
Co-authored-by: zhaoge <>
* chore(release): 4.1.0-beta.0
* fix: #362 set hiveVar value (#369)
* fix: #371 export EntityContext types (#372)
* fix: minimum collect candidates boundary to fix parse performance (#378)
* fix: minimum collect candidates boundary to fix parse performance
* fix: fix check-types
* fix: remove debugger code
* fix(flink): fix flinksql syntax error about ROW and function using (#383)
Co-authored-by: zhaoge <>
* build: pnpm antlr4 --lang all
* Feat/follow keywords (#407)
* feat: provide follow keywords when get suggestions
* chore: add watch script
* refactor: optimize spark grammar (#360)
* feat: support semantic context of isNewStatement (#361)
* feat: support semantic context of isStatamentBeginning
* docs: add docs for semantic context
* feat: unify variables in lexer (#366)
* feat: unify variables in lexer
* fix: all sql use WHITE_SPACE
* feat: complete after error syntax (#334)
* refactor: split getMinimumParserInfo to slice input and parser again
* test: complete after error syntax
* feat: complete after error syntax
* feat: use createParser to get parserIns and remove parserWithNewInput
* feat(all sql): add all sql expression column (#358)
* feat(impala): add impala expression column
* feat(trino): add expression column
* feat(hive): add hive expression column
* feat(spark): add spark expression column
* feat(mysql): add mysql expression column unit test
* feat(flink): add flink expression column
* feat(postgresql): add pg expression column
* feat: #410 optimize processCandidates tokenIndexOffset (#411)
* test: test suggestion wordRanges with range when processCandidates without tokenIndexOffset
* feat: #410 optimize processCandidates tokenIndexOffset
---------
Co-authored-by: 霜序 <976060700@qq.com>
Co-authored-by: XCynthia <942884029@qq.com>
Co-authored-by: 琉易 <liuxy0551@qq.com>
Co-authored-by: liuyi <liuyi@dtstack.com>
Co-authored-by: zhaoge <>
Co-authored-by: Hayden <hayden9653@gmail.com>
Co-authored-by: JackWang032 <64318393+JackWang032@users.noreply.github.com>
Co-authored-by: JackWang032 <2522134117@qq.com>

liuxy0551 force-pushed the feat_complete branch 2 times, most recently from f6b1a3d to 56e4d0d Compare July 30, 2024 16:41

HaydenOrz force-pushed the next branch from 1d9dc1c to 8b1ba06 Compare August 2, 2024 03:02

liuxy0551 force-pushed the feat_complete branch 2 times, most recently from b610cb3 to d841e3c Compare August 26, 2024 08:54

HaydenOrz force-pushed the next branch from 7dc1d46 to a3b6b7e Compare August 27, 2024 02:17

liuxy0551 force-pushed the feat_complete branch 8 times, most recently from 944a97f to 417c063 Compare September 27, 2024 07:57

liuxy0551 requested review from HaydenOrz, Cythia828 and LuckyFBB September 27, 2024 07:58

liuxy0551 marked this pull request as ready for review September 27, 2024 07:58

liuxy0551 changed the title ~~test: complete after error syntax~~ feat: complete after error syntax Sep 27, 2024

HaydenOrz reviewed Oct 15, 2024

src/parser/flink/index.ts (outdated)

src/parser/common/basicSQL.ts (outdated)

liuxy0551 force-pushed the feat_complete branch 2 times, most recently from 9ce9722 to 6cebe8e Compare October 15, 2024 14:14

HaydenOrz force-pushed the next branch from dc2acf9 to 2358d95 Compare October 17, 2024 11:23

HaydenOrz reviewed Oct 22, 2024

src/parser/common/basicSQL.ts (outdated)

liuxy0551 force-pushed the feat_complete branch from 6cebe8e to c031b24 Compare October 22, 2024 08:40

liuxy0551 force-pushed the feat_complete branch from c031b24 to f2b72ed Compare November 7, 2024 10:58

liuxy0551 force-pushed the next branch from c232aaa to ba9e3d6 Compare November 14, 2024 04:00

liuxy0551 mentioned this pull request Nov 15, 2024

How to determine the relationship between aliases and entities from the SQL statement context DTStack/monaco-sql-languages#151

Closed

liuxy0551 force-pushed the feat_complete branch from f2b72ed to de3b760 Compare November 19, 2024 02:47

liuxy0551 mentioned this pull request Nov 28, 2024

This suggestion seems a bit odd; it is probably related to how the grammar file is implemented DTStack/monaco-sql-languages#161

Open

liuxy0551 mentioned this pull request Dec 18, 2024

Autocompletion not working DTStack/monaco-sql-languages#166

Open

liuxy0551 force-pushed the feat_complete branch from de3b760 to f35b86a Compare April 2, 2025 11:48

liuxy0551 mentioned this pull request Apr 3, 2025

feat: #410 optimize processCandidates tokenIndexOffset #411

Merged

liuxy0551 force-pushed the feat_complete branch from f35b86a to d8b2456 Compare April 3, 2025 03:31

liuxy0551 and others added 3 commits April 3, 2025 11:33

refactor: split getMinimumParserInfo to slice input and parser again

bd83147

test: complete after error syntax

593dbed

feat: complete after error syntax

3915739

liuxy0551 force-pushed the feat_complete branch from d8b2456 to 3915739 Compare April 3, 2025 03:33

liuxy0551 requested a review from JackWang032 April 3, 2025 03:37

JackWang032 reviewed Apr 3, 2025

feat: use createParser to get parserIns and remove parserWithNewInput

46a51ec

LuckyFBB added the Next Version label Apr 8, 2025

mumiao approved these changes May 8, 2025

mumiao merged commit 99b01e5 into DTStack:next May 8, 2025
6 checks passed
