This is from the SQLite creator D. Richard Hipp, who is always worth reading, but I'd also like to recommend reading what TCL's creator John Ousterhout has to say.
His article on threads from 1995 was highly influential on me, and I remember it to this day. More recently (2018, revised and expanded in 2021), he published a book on software engineering practices called A Philosophy of Software Design which is, in my opinion, the best in its category.
+100 for his book. TIL he also created TCL. The book is excellent and one of the very few resources that talks about architecture independent of the language or tech stack used.
He also gave a talk that greatly influenced how I look at relationships: https://gist.github.com/gtallen1187/27a585fcf36d6e657db2
Like Lamport, who is more widely known for (what was originally) his side project LaTeX than for his seminal distributed systems research including Paxos, Ousterhout tends to be more widely known by name for (what was originally) his side project Tcl than for his seminal distributed systems research including Raft.
What an amusing coincidence, I didn't know Lamport wrote LaTeX, rather I knew of him only in connection with Lamport clocks.
> Early versions of SQLite (prior to 2004) operated on the classic TCL principle that "everything is a string". Beginning with SQLite3 (2004-06-18), SQLite also supports binary data.
TCL can handle binary data. It is just not a separate type: https://wiki.tcl-lang.org/page/Working+with+binary+data
SQLite also always had a null type, surely?
> However, types are still very flexible in SQLite, just as they are in TCL. SQLite treats the datatypes on column names in a CREATE TABLE statement as suggestions rather than hard requirements
This is something I do not much like. It's not compulsory (you can create "strict" tables). It works well in TCL, which is an entire language designed around the idea. Less so in SQL.
One of the advantages of RDBMSes is that not accepting obviously wrong data makes life easier for developers. You can debug an issue when it happens, on inserting the data, not when you find the wrong type or other bad data much later on.
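For anyone who wants to see the difference being discussed, here is a minimal sketch using Python's standard `sqlite3` module. The table names `loose` and `tight` are made up for illustration, and the STRICT part is guarded because it assumes the linked SQLite library is 3.37.0 or newer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Ordinary table: the declared INTEGER type is only an affinity (a suggestion).
conn.execute("CREATE TABLE loose (n INTEGER)")
conn.execute("INSERT INTO loose VALUES ('not a number')")
loose_type = conn.execute("SELECT typeof(n) FROM loose").fetchone()[0]
print(loose_type)  # the value was accepted and stored as text

# STRICT tables need SQLite 3.37.0 or newer, so check the linked library first.
strict_rejected = None
if sqlite3.sqlite_version_info >= (3, 37, 0):
    conn.execute("CREATE TABLE tight (n INTEGER) STRICT")
    try:
        conn.execute("INSERT INTO tight VALUES ('not a number')")
        strict_rejected = False
    except sqlite3.Error:
        # The insert fails immediately, at the point where the bad data arrives.
        strict_rejected = True
    print(strict_rejected)
```

With the ordinary table the bad value only shows up later, when something reads the column and expects an integer. With the strict table the error surfaces at insert time.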
There was a time when TCL could not handle binary data at all. The support came in TCL 8.0:
"Binary data is now supported in Tcl."
https://www.tcl.tk/software/tcltk/relnotes/tcl8.0.txt
Well before 2004, but worth mentioning because you'll find a fair amount of old posts complaining about it.
This typing behaviour, now in combination with strict tables, is a boon for biostats. When you get shitty data to be cleaned, you've got 3 main choices: 1. use slow and untyped scripting languages, 2. use a strictly typed database, meaning you'll have to clean your data in advance, or 3. load it all as strings into SQLite, then clean the data until it fits into a strict table with check constraints. IMO, it's pretty clear 3 is best by far!
I agree that 3 is great, but you can also do that in any database: just create your input tables as string-only and then perform the necessary operations to move them into typed tables.
Yes, but with SQLite there is much less ceremony (no server, etc.) and, most importantly, it can be used without talking to my institution's sysadmin, which is what I'm looking for when manipulating one-off datasets.
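A rough sketch of what option 3 might look like with Python's built-in `sqlite3` module and an in-memory database. The table names, column names, sample values, and the age range in the CHECK constraint are all invented for illustration; STRICT is only added when the linked SQLite is 3.37.0 or newer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Step 1: stage everything as text, exactly as it arrived (hypothetical data).
conn.execute("CREATE TABLE raw_samples (patient_id TEXT, age TEXT)")
conn.executemany(
    "INSERT INTO raw_samples VALUES (?, ?)",
    [("p1", " 42 "), ("p2", "n/a"), ("p3", "37"), ("p4", "-5")],
)

# Step 2: the typed target table; the CHECK constraint encodes a validity rule.
strict = " STRICT" if sqlite3.sqlite_version_info >= (3, 37, 0) else ""
conn.execute(
    "CREATE TABLE samples ("
    " patient_id TEXT PRIMARY KEY,"
    " age INTEGER NOT NULL CHECK (age BETWEEN 0 AND 130)"
    ")" + strict
)

# Step 3: clean in SQL and move only rows whose age is made of digits alone.
conn.execute(
    "INSERT INTO samples (patient_id, age) "
    "SELECT patient_id, CAST(trim(age) AS INTEGER) FROM raw_samples "
    "WHERE trim(age) <> '' AND trim(age) NOT GLOB '*[^0-9]*'"
)

clean_rows = conn.execute(
    "SELECT patient_id, age FROM samples ORDER BY patient_id"
).fetchall()

# Rows that did not make it across stay in the staging table for review.
leftover = conn.execute(
    "SELECT patient_id FROM raw_samples "
    "WHERE patient_id NOT IN (SELECT patient_id FROM samples) "
    "ORDER BY patient_id"
).fetchall()

# The typed table refuses values that break the CHECK constraint.
try:
    conn.execute("INSERT INTO samples VALUES ('p9', 999)")
    check_enforced = False
except sqlite3.IntegrityError:
    check_enforced = True

print(clean_rows, leftover, check_enforced)
```

Swapping `":memory:"` for a file path gives a single-file database with no server involved, which is the low-ceremony part.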
Back in the day, the Tcl conferences (and the EuroTcl in Europe) were great sources of information. Only about half of the presentations were for Tcl internals and extensions. The rest were about other projects that were using Tcl in some way or other, and it was fascinating to learn about completely different areas of software.
They were. I attended for many years, including the conference where Richard gave this talk.
I wonder -- if they were to restart from scratch today, would they do the same thing? If not, which stack would they choose?
Has anyone used the mysterious "e" editor?
There's a manual and it influenced Stallman in the creation of Emacs so some people sure did.
http://i.stanford.edu/pub/cstr/reports/cs/tr/80/796/CS-TR-80...
That's not the E editor he talked about. It is the editor created by Richard Hipp himself. You can find a reference here:
https://wiki.tcl-lang.org/page/Tcl+Editors
And you can find a version here:
http://grumbeer.dyndns.org/ftp/cdroms/freebsd/freebsd-2.2.1-...
No, that's Todd Squires's e93 editor, the next entry on the wiki :)
That's a treasure!
> For example, the byte-code engine used to evaluate SQL statements inside of SQLite is implemented as a large "switch" statement inside a "for" loop, with a separate "case" for each opcode, all in the "vdbe.c" source file.
Duff's Device[1] for the win!
[1] https://en.m.wikipedia.org/wiki/Duff's_device
Duff's Device is a for loop inside a switch statement. Totally different.
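To make the quoted description concrete, here is a toy sketch of the "dispatch on opcode inside a loop" shape. It is written in Python, so an if/elif chain stands in for the C "switch", and the opcodes are hypothetical ones for a tiny stack machine, not SQLite's actual VDBE opcodes.

```python
# Hypothetical opcodes for a toy stack machine; these are not SQLite's opcodes.
PUSH, ADD, MUL, HALT = range(4)


def run(program):
    """Execute a list of (opcode, operand) pairs and return the final stack."""
    stack = []
    pc = 0  # program counter: index of the next instruction
    while True:  # plays the role of the outer "for" loop
        op, arg = program[pc]
        pc += 1
        # One branch per opcode, like one "case" per opcode in a C switch.
        if op == PUSH:
            stack.append(arg)
        elif op == ADD:
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif op == MUL:
            b, a = stack.pop(), stack.pop()
            stack.append(a * b)
        elif op == HALT:
            return stack
        else:
            raise ValueError(f"unknown opcode {op}")


# Computes (2 + 3) * 4 on the toy machine.
result = run([(PUSH, 2), (PUSH, 3), (ADD, None), (PUSH, 4), (MUL, None), (HALT, None)])
print(result)
```

The loop is on the outside and the dispatch is on the inside. Duff's Device nests them the other way round, which is the distinction drawn in the reply above.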