Skip to content

sqlite: add trace sql hook#62241

Open
araujogui wants to merge 7 commits intonodejs:mainfrom
araujogui:sqlite-verbose
Open

sqlite: add trace sql hook#62241
araujogui wants to merge 7 commits intonodejs:mainfrom
araujogui:sqlite-verbose

Conversation

@araujogui
Copy link
Member

@araujogui araujogui commented Mar 13, 2026

Add DatabaseSync trace sql hook for logging and debug

@nodejs-github-bot
Copy link
Collaborator

Review requested:

  • @nodejs/sqlite

@nodejs-github-bot nodejs-github-bot added c++ Issues and PRs that require attention from people who are familiar with C++. lib / src Issues and PRs related to general changes in the lib or src directory. needs-ci PRs that need a full CI run. labels Mar 13, 2026
@louwers
Copy link
Contributor

louwers commented Mar 13, 2026

@araujogui Maybe setSqlTraceHook? This is a bit more explicit and makes it easier to match it up with the corresponding SQLite documentation / functionality.

@codecov
Copy link

codecov bot commented Mar 13, 2026

Codecov Report

❌ Patch coverage is 74.07407% with 7 lines in your changes missing coverage. Please review.
βœ… Project coverage is 89.67%. Comparing base (da5843b) to head (83627f5).
⚠️ Report is 12 commits behind head on main.

Files with missing lines Patch % Lines
src/node_sqlite.cc 74.07% 3 Missing and 4 partials ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main   #62241      +/-   ##
==========================================
- Coverage   89.67%   89.67%   -0.01%     
==========================================
  Files         676      676              
  Lines      206469   206582     +113     
  Branches    39537    39562      +25     
==========================================
+ Hits       185157   185244      +87     
- Misses      13448    13465      +17     
- Partials     7864     7873       +9     
Files with missing lines Coverage Ξ”
src/node_sqlite.h 80.64% <ΓΈ> (ΓΈ)
src/node_sqlite.cc 80.73% <74.07%> (-0.08%) ⬇️

... and 40 files with indirect coverage changes

πŸš€ New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • πŸ“¦ JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@araujogui araujogui requested a review from louwers March 13, 2026 18:30
@araujogui araujogui changed the title sqlite: add verbose option sqlite: add trace sql hook Mar 13, 2026
@Renegade334
Copy link
Member

Should we make this a diagnostics channel instead?

@araujogui
Copy link
Member Author

araujogui commented Mar 14, 2026

Should we make this a diagnostics channel instead?

Sounds a good idea to me

CC @nodejs/sqlite @nodejs/diagnostics

@Qard
Copy link
Member

Qard commented Mar 15, 2026

Yeah, this seems very much like a thing diagnostics_channel would be more suitable for. There's also a native-side publishing API now, so you can use that.

@louwers
Copy link
Contributor

louwers commented Mar 15, 2026

What is the advantage of using a diagnostic channel here?

What if you have an application with 1000s of DatabaseSync objects and you're only interested in seeing the SQL queries of one particular instance? How would you even differentiate between them? What if you need to do different things with the traces of different database handles (e.g. send them to different logging backends)?

Do diagnostic channels offer that much flexibility?

Note that there are other things that sqlite3_trace_v2 can trace. I think it would be good to eventually support all of them.

@Qard
Copy link
Member

Qard commented Mar 16, 2026

The sqlite trace interfaces just gives you some data at the C layer. You still need to figure out how to deliver that efficiently to JS. The diagnostics_channel API is designed to be as close as possible to zero-cost when nothing is listening for a given event, and to be as cheap as possible when something is. It also allows decoupling through named channels so you don't have to explicitly pass around your objects to the things which want to observe them, which makes it much easier for Observability tools to see them. Why reinvent all this machinery to get your data from C to JavaScript when we already have a thing specifically for this purpose?

Copy link
Member

@Qard Qard left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generally LGTM, but if at all possible I'd like to see more data than just the final SQL string. It'd be helpful to be able to better correlate where exactly it came from and what the original inputs were.

if (retval.IsEmpty()) {
db->SetIgnoreNextSQLiteError(true);
}
ch->Publish(env, sql_string);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it possible to get the query template and values as they were prior to generating the final SQL string? If at all possible, I think it would be valuable for Observability purposes to publish an event which includes: database instance, query template, query parameters, and final query string.

@louwers
Copy link
Contributor

louwers commented Mar 16, 2026

Because, as I understand it, it is not scoped per database connection. It's all or nothing.

If you have 1000s of concurrent database connections, there's no way to trace only a single of them with this implementation. You will have to deal with a deluge of traces from all of them once you enable the diagnostics channel, with no way to differentiate between messages.

@Qard
Copy link
Member

Qard commented Mar 16, 2026

You could just put the instance in the published message so you can check if it matches whichever instance you care about. That's what is typically done with diagnostics_channel. It's meant to be a bag of relevant data that the subscriber can look at to extract as much or as little detail as it needs. Some cases might need to correlate with specific instances, some might not, so they can pick and choose how much of the data they do anything with.

And if you really wanted events to only be published for a specific instance, though there's no specific recommendation or precedent for it, you could simply not publish the events from a given instance until some particular flag on that instance has been enabled. That'd move the control of making the data available from the consumer to the producer though, which may not be desirable.

Copy link
Contributor

@louwers louwers left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is a real performance impact of registering a trace callback with sqlite3_trace_v2. It should not be done for all databases indiscriminately.

sqlite/sqlite-prepare-select-get-options.js, 30 runs (no listeners for diagnostics channel):

options='returnArrays' statement='SELECT                               β–ˆβ–ˆ| -1.2%
options='readBigInts                                                  β–ˆβ–ˆβ–ˆ| -1.4%
statement='SELECT * FROM foo LIMIT 1' ta                              β–ˆβ–ˆβ–ˆ| -1.7%
statement='SELECT text_column, integer_c                             β–ˆβ–ˆβ–ˆβ–ˆ| -2.0%
statement='SELECT text_8kb_column FROM f                            β–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| -2.3%
statement='SELECT text_column, integer_c                           β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| -2.8%
options='none' statement='SELECT * FROM                            β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| -2.8%
statement='SELECT text_column, integer_c                           β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| -3.0%
options='readBigInts' statement='SELECT                           β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| -3.3%
statement='SELECT text_column FROM foo L                         β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| -3.8%
statement='SELECT 1' tableSeedSize=10000                      β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| -5.6%

Using a diagnostics channel is probably OK as long as there is some way to find the corresponding database for all messages. Right now this is not possible which is problematic.

My philosophy is that the SQLite API of Node.js should mirror that of SQLite itself where it makes sense. Being able to register callbacks directly on database handles matches SQLite more closely. It is also more ergonomic in my opinion.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

c++ Issues and PRs that require attention from people who are familiar with C++. lib / src Issues and PRs related to general changes in the lib or src directory. needs-ci PRs that need a full CI run.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants