Here is my rather naive take on the same subject. But I had a very different motivation than the author. See I actually quite like SQL and enjoy programming in it, but what I don't like is mixing sql and python. So one night in a flash of inspiration or perhaps a fever dream I wrote this thing that lets you have stand alone parameterized sql queries and you call them like a python function or generator. It is one of those overly clever things where I sort of hate the magic, but I find myself using it more and more which I will probably regret one day.
This is a nice package, and a great illustration of how languages other than R suffer from the lack of an aesthetically elegant way to select list elements with bare words, like R's $ operator.
Because their lists don't have selection by bare words, they have to go one of several other specialized, distinct, built-in Abstract Data Types to get it. They have to create whole so-called "Classes" and "Modules", when all they really needed was a list whose elements can be accessed with a dot and a bare word.
The pandas package for tabular data manipulation requires even more complicated workarounds. It has a DataFrame Class composed of objects of Column Class. Then it makes an arbitrary bunch of common functions, so common that many are built into Python itself, Methods of said Columns. (In R, a table is just a list of vectors, and no Methods are needed.)
So now you've got a thing that's supposedly a real Class, but it's really just a container of completely arbitrary fields and data types. These fields are themselves instances of another Class that is supposedly specific to pandas, but is really just a vector, and a vector doesn't necessarily have anything to do with being part of a table. And that Class has some random methods that give you additional ways to do basic things the language already does, and are often not the functions you actually need to work with the data therein.
All that just so that we can write stuff like df.col.max(), and... gosh, what is that even supposed to mean? Can we all just admit that we like writing code in chains separated by dots, and stop tying that capability to hierarchies of Official Abstract Data Types?
These non-R languages make you utter such strange incantations just to put something in a key-value container and access that thing with nice-looking code. I feel like this makes it harder to realize that very often this is the best way of doing things.
R has a bit more varied and sometimes mildly ugly syntax than other languages, but once you get used to the building blocks it gives you, it has all these powers to do very dynamic things in very easy ways, without a bunch of ponderous specialized concepts.
I honestly have no opinion in this discussion, but I will 100% upvote the first Fossil repository I've seen shared on here! How do you find developing on the Fossil platform?
Really like the idea of keeping SQL explicit while still getting ORM conveniences. The @sqlfunc syntax feels clean, and the no-session approach makes cross DB work simpler. Curious how it handles async or pooling. Seems like a solid middle ground between raw SQL and heavy ORMs.
A common problem I found myself in is that I have to develop the query in one file and frequently run it to verify the data accuracy. I define the variables at top reminding me types and limitations etc. int vs varchar(10) vs varchar(50)
So I made a very simple module that takes those sql files and turns them into SQLAlchemy text objects with variables in them.
Would it be possible to add something like this to the project or does it require many sql parsing libraries etc. to ensure sql validity to find variables in the sql file?
> However, I've always felt some of the design choices didn't fit how I like to use an ORM. Notably:
I feel the same, hence why I prefer a Django-like ORM to SQLAlchemy in spite of all the praises it gets. The author says "SQLAlchemy is the best. I don't like the API or codebase of the others", but actually what he describes feels like the Django ORM (or Tortoise, or many others).
Also, sometimes just a thin layer above SQL is fine. For small personal projects I use my own wrapper above sqlite like so:
import oora
from dataclasses import dataclass
db = oora.DB(
db_path=":memory:", # or /path/to/your/db.sqlite3
# migrations are just pairs of key=>val where key is an arbitrary (but unique) label and val is a SQL script or a callable.
# If val is a callable, it must take a sqlite3.Cursor as first parameter.
# migrations are executed in order
migrations={
# here's an initial migration:
"0000": "CREATE TABLE IF NOT EXISTS user(id INTEGER PRIMARY KEY, name TEXT UNIQUE NOT NULL);",
# simulating a schema evolution, let's add a field:
"0001": "ALTER TABLE user ADD COLUMN email TEXT NULL;",
},
)
db.migrate()
db.insert("user", {"name": "John"})
db.insert("user", {"name": "Jack"})
db.insert("user", {"name": "Jill"})
# dataclasses are perfect to represent rows
# while still allowing custom behaviour
@dataclass
class User:
id: int
name: str
email: str
def __str__(self):
return self.name
# fetch a random instance
user = db.hydrate(User, db.execute("select * from user ORDER BY RANDOM() limit 1").fetchone())
print(f"User(id {user.id}), original name: {user}")
# change name and email
user.name = "Richard"
user.email = "richard@acme.tld"
db.save(user) # name of table is infered from the dataclass name
print(f"User(id {user.id}), updated name: {user} <{user.email}>")
# persist changes
db.commit()import oora
from dataclasses import dataclass
db = oora.DB(
db_path=":memory:", # or /path/to/your/db.sqlite3
# migrations are just pairs of key=>val where key is an arbitrary (but unique) label and val is a SQL script or a callable.
# If val is a callable, it must take a sqlite3.Cursor as first parameter.
# migrations are executed in order
migrations={
# here's an initial migration:
"0000": "CREATE TABLE IF NOT EXISTS user(id INTEGER PRIMARY KEY, name TEXT UNIQUE NOT NULL);",
# simulating a schema evolution, let's add a field:
"0001": "ALTER TABLE user ADD COLUMN email TEXT NULL;",
},
)
db.migrate()
db.insert("user", {"name": "John"})
db.insert("user", {"name": "Jack"})
db.insert("user", {"name": "Jill"})
# dataclasses are perfect to represent rows
# while still allowing custom behaviour
@dataclass
class User:
id: int
name: str
email: str
def __str__(self):
return self.name
# fetch a random instance
user = db.hydrate(User, db.execute("select * from user ORDER BY RANDOM() limit 1").fetchone())
print(f"User(id {user.id}), original name: {user}")
# change name and email
user.name = "Richard"
user.email = "richard@acme.tld"
db.save(user) # name of table is infered from the dataclass name
print(f"User(id {user.id}), updated name: {user} <{user.email}>")
# persist changes
db.commit()
Here is my rather naive take on the same subject. But I had a very different motivation than the author. See I actually quite like SQL and enjoy programming in it, but what I don't like is mixing sql and python. So one night in a flash of inspiration or perhaps a fever dream I wrote this thing that lets you have stand alone parameterized sql queries and you call them like a python function or generator. It is one of those overly clever things where I sort of hate the magic, but I find myself using it more and more which I will probably regret one day.
https://nl1.outband.net/fossil/query/file?name=query.py&ci=t...
In short you have your query in file sql/dept_personal.sql and you call it like
You may enjoy https://pugsql.org
This is a nice package, and a great illustration of how languages other than R suffer from the lack of an aesthetically elegant way to select list elements with bare words, like R's $ operator.
Because their lists don't have selection by bare words, they have to go one of several other specialized, distinct, built-in Abstract Data Types to get it. They have to create whole so-called "Classes" and "Modules", when all they really needed was a list whose elements can be accessed with a dot and a bare word.
The pandas package for tabular data manipulation requires even more complicated workarounds. It has a DataFrame Class composed of objects of Column Class. Then it makes an arbitrary bunch of common functions, so common that many are built into Python itself, Methods of said Columns. (In R, a table is just a list of vectors, and no Methods are needed.)
So now you've got a thing that's supposedly a real Class, but it's really just a container of completely arbitrary fields and data types. These fields are themselves instances of another Class that is supposedly specific to pandas, but is really just a vector, and a vector doesn't necessarily have anything to do with being part of a table. And that Class has some random methods that give you additional ways to do basic things the language already does, and are often not the functions you actually need to work with the data therein.
All that just so that we can write stuff like df.col.max(), and... gosh, what is that even supposed to mean? Can we all just admit that we like writing code in chains separated by dots, and stop tying that capability to hierarchies of Official Abstract Data Types?
These non-R languages make you utter such strange incantations just to put something in a key-value container and access that thing with nice-looking code. I feel like this makes it harder to realize that very often this is the best way of doing things.
R has a bit more varied and sometimes mildly ugly syntax than other languages, but once you get used to the building blocks it gives you, it has all these powers to do very dynamic things in very easy ways, without a bunch of ponderous specialized concepts.
Correction, the Column object is called Series, but the same points apply.
I honestly have no opinion in this discussion, but I will 100% upvote the first Fossil repository I've seen shared on here! How do you find developing on the Fossil platform?
Kudos for making the leap.
Your pattern of re-interpreting __doc__ is kinda weird though. Why not just add a `return` statement?
Really like the idea of keeping SQL explicit while still getting ORM conveniences. The @sqlfunc syntax feels clean, and the no-session approach makes cross DB work simpler. Curious how it handles async or pooling. Seems like a solid middle ground between raw SQL and heavy ORMs.
Agreed. I have a TypeScript library that attempts something similar: https://jawj.github.io/zapatos/
A common problem I found myself in is that I have to develop the query in one file and frequently run it to verify the data accuracy. I define the variables at top reminding me types and limitations etc. int vs varchar(10) vs varchar(50)
So I made a very simple module that takes those sql files and turns them into SQLAlchemy text objects with variables in them.
Would it be possible to add something like this to the project or does it require many sql parsing libraries etc. to ensure sql validity to find variables in the sql file?
> However, I've always felt some of the design choices didn't fit how I like to use an ORM. Notably:
I feel the same, hence why I prefer a Django-like ORM to SQLAlchemy in spite of all the praises it gets. The author says "SQLAlchemy is the best. I don't like the API or codebase of the others", but actually what he describes feels like the Django ORM (or Tortoise, or many others).
Also, sometimes just a thin layer above SQL is fine. For small personal projects I use my own wrapper above sqlite like so:
I think this is very similar to Django’s ORM.