One task where GeoJSON falls down is simplification of a group of polygons with common boundaries, e.g. the 48 conterminous US states. If you start with a highly detailed set of polygons, you need to simplify them for practical display in an online map.
GeoJSON doesn't encode the fact that the boundary points are common between adjacent polygons. When you simplify those polygons, each one is handled separately, and you end up with "slivers" where the boundaries are misaligned:
TopoJSON solves this by encoding each such boundary only once. So when you simplify the polygons, they are all done together, and the same simplification applies to adjacent polygons. No more slivers!
Is this actually GeoJSON falling down, or decades of convention extended to JSON? Topology is great, but it is sidestepped by Shapefile/WKT/WKB/etc, in favor of independent primitives like POINT, LINE, POLYGON. If GeoJSON did not exist as a new JSON GIS data format encoding these primitives, TopoJSON would not have "replaced" it, due to the added mis-match with other non-topological formats.
From what I can tell, the top criticism of GeoJSON is the under-enforced winding order specification, and crossing the antemeridian.
I’ve applied GeoJSON (among many other GIS tech) for mapping and monitoring tens of thousands of warehouse robots. It works great as long as you squint just a bit, ignoring that it generally calls for long,lat and is designed with the assumption of a world CRS.
The dangerous part is that some tools fully assume this and will completely screw with calculations if you’re assuming a flatland CRS. So you’ve got to be careful in checking and setting those parameters.
One nice thing is that the structure of GeoJSON works incredibly well in typescript. It has discriminated unions built in so you can walk entire geodatasets in a pretty comfortable way.
> It works great as long as you squint just a bit, ignoring that it generally calls for long,lat and is designed with the assumption of a world CRS.
I thought the spec allowed you to specify the CRS, but I just checked the RFC and they removed that from the 2016 specification and WGS84 is specified. It does allow for alternative CRS with prior arrangement, but like you said that does require a lot of care.
I’ve had nothing but problems using GeoJson. The specification has limitations everywhere and doesn’t even support z + m values at the same time.
But thankfully there is also the SQLite backed GeoPackage, which is not only more flexible but also much smaller. It takes some extra steps to get testing teams working due to it’s binary nature, but other than that it is the best format in geospatial data analysis.
We used this extensively when I worked in this space (2010 - 2014). My favorite addition was using https://github.com/topojson/topojson to add arcs. That cut down on quite a bit of points to represent curves.
Dang, fun memories of when I was first getting in to geo/data stuff and doing a lot of web mapping stuff with D3, Leaflet and friends. Seems as tools like Vector tiles/PMTiles have supplanted topojson for a lot of visualization oriented use cases.
GeoJSON is super useful. At Getcho (delivery, logistics) we use zip code GeoJSON encodings to draw polygons on zone maps and quickly generate rates. This has been a persistently annoying thing to do until we discovered this format. If you're curious, someone made a repo with all the 2010 census zips a while back [0].
About 25% of ZIP codes don't have a corresponding Census Bureau ZCTA, for example 10118. Do you end up needing special handling for those cases? Or has it not yet come up in practice?
Excellent question it certainly does come up. Practically speaking the more populous zip codes are all accounted for and that’s where the vast majority of deliveries go to. For example I took the census zip code data 150 miles (crow flies) outside Philly and found virtually 100% coverage.
For missing ones you have to fall back to distance based estimates and in my business that means you’re quote may be off and you’re exposed
No shade whatsoever at you or your business: I'll say upfront that you certainly made the right practical decision for the goal of running a business.
That said, this is a textbook example of what I have always found so infuriating, personally, about working on commercial software, and one of the many reasons I ultimately moved into a non-software-writing role. The (very sensible and practical) shortcuts and tradeoffs that are commonly made due to time and cost constraints. The attitude of "well the vast majority of our use cases work, so we're done." I've always thought edge cases must be addressed. Something in my brain hurts when I knowingly release something where only 99% of cases work.
I can imagine this is probably the same thing some artists feel when they are commissioned to produce (in their view rushed, flawed, or incomplete) artwork for business purposes.
I only write software at home, as a hobby now, and this gives me the outlet to follow my heart around edge cases!
Have been using GeoJSON, very handy and human-readable, but we recently switched to GeoPackage files, as it allows for different layers, each with a different schema for additional data.
GeoPackages also allow to set a proper CRS, which is not as easy in GeoJSON IIRC.
GeoJSON is not just for geographical features! Shapes of any kind work just as well.
QuPath[1], a tool for digital pathology whole slide image analysis, can export annotations in GeoJSON format (and import too I suppose).[2] This makes it really very easy to make annotations transportable between tooling.
And with PostgREST [0], you can automatically convert any PostGIS table (with geometry or geography column) to GeoJSON by using an "Accept: application/geo+json" header in the request.
I vibe coded something similar (different data source) with codex that went something like SQLite->GeoJSON->Leaflet and it was a dream - almost no corrections necessary. It even went off and found a really nice colour scheme for me.
This is nice. I haven't worked with GIS data for ages but I really like the idea of a well-understood plain text container for it. Much nicer than wrangling with binary formats like shapefiles, especially when something goes wrong and you're not sure if it's your code (well more precisely your usage of whatever library you've got for it) or the data.
JSON parsing is probably one of the most thoroughly optimized subsystems in the whole industry at this point. Obviously there are ways to encode the same data that are easier to parse (e.g. instead of absolute floating point coordinates, use integer deltas along a path in some reasonable CRS) but because this inefficient representation is so common and so long-standing the parsing is faster than people think.
the default parsers all load the entire thing in mem, which is not good.
so you need a stream-based parser, which nobody does an effort to write/use for json. especially since geojson is a web format, and people just default to json.parse, which is blocking. and even then, even if you did use the custom one, it likely won't be a geojson-tailored one, so because key-order isn't guaranteed, any parser for geo-json will need to do some acrobatics to finding the reference-system, dealing with arbitrarily nested geometries etc..
it's a good format for what it is, but it's not a great geo-format. a geo format needs to be easily scannable and, even better, have a geometry index to be able to seek quickly.
Nope, properties must be an object (dictionary or null). Which means each property can only appear once.
The spec doesn't say what type the value of a property can be, though. Examples in the RFC show strings, floats, and a nested object. So you could probably put a list in there as well if you want to store multiple values under the same key, provided that your decoder knows what to do with such values. (GeoJSON is often converted to and from WKB/WKT, and unorthodox values may be lost in the conversion.)
Recently I got into cartography software for a bit and the horrors of the data formats in this industry are real. Feels like everyone under the sun has their own.
All that said, GeoJSON was a great change of pace, I enjoyed using it. While I'm no professional and have no idea what the professional needs are, it was very good for my hobbyist needs.
One task where GeoJSON falls down is simplification of a group of polygons with common boundaries, e.g. the 48 conterminous US states. If you start with a highly detailed set of polygons, you need to simplify them for practical display in an online map.
GeoJSON doesn't encode the fact that the boundary points are common between adjacent polygons. When you simplify those polygons, each one is handled separately, and you end up with "slivers" where the boundaries are misaligned:
https://www.bing.com/images/search?q=map+slivers+betwen+poly...
TopoJSON solves this by encoding each such boundary only once. So when you simplify the polygons, they are all done together, and the same simplification applies to adjacent polygons. No more slivers!
https://github.com/topojson/topojson
https://github.com/topojson/topojson-simplify
Is this actually GeoJSON falling down, or decades of convention extended to JSON? Topology is great, but it is sidestepped by Shapefile/WKT/WKB/etc, in favor of independent primitives like POINT, LINE, POLYGON. If GeoJSON did not exist as a new JSON GIS data format encoding these primitives, TopoJSON would not have "replaced" it, due to the added mis-match with other non-topological formats.
From what I can tell, the top criticism of GeoJSON is the under-enforced winding order specification, and crossing the antemeridian.
Right. Encoding a union algorithm into the data structure just introduces the reverse problem: Selecting a subset now requires extra logic beyond jq.
I’ve applied GeoJSON (among many other GIS tech) for mapping and monitoring tens of thousands of warehouse robots. It works great as long as you squint just a bit, ignoring that it generally calls for long,lat and is designed with the assumption of a world CRS.
The dangerous part is that some tools fully assume this and will completely screw with calculations if you’re assuming a flatland CRS. So you’ve got to be careful in checking and setting those parameters.
One nice thing is that the structure of GeoJSON works incredibly well in typescript. It has discriminated unions built in so you can walk entire geodatasets in a pretty comfortable way.
> It works great as long as you squint just a bit, ignoring that it generally calls for long,lat and is designed with the assumption of a world CRS.
I thought the spec allowed you to specify the CRS, but I just checked the RFC and they removed that from the 2016 specification and WGS84 is specified. It does allow for alternative CRS with prior arrangement, but like you said that does require a lot of care.
> tens of thousands of warehouse robots
Sounds like Amazon
Definitely not Amazon. Yuck.
I’ve had nothing but problems using GeoJson. The specification has limitations everywhere and doesn’t even support z + m values at the same time.
But thankfully there is also the SQLite backed GeoPackage, which is not only more flexible but also much smaller. It takes some extra steps to get testing teams working due to it’s binary nature, but other than that it is the best format in geospatial data analysis.
Long live SQLite!
Made by Sean Gillies and a few others. Back when mapbox was doing all sorts of great open source stuff. Legends
https://github.com/sgillies
We used this extensively when I worked in this space (2010 - 2014). My favorite addition was using https://github.com/topojson/topojson to add arcs. That cut down on quite a bit of points to represent curves.
Dang, fun memories of when I was first getting in to geo/data stuff and doing a lot of web mapping stuff with D3, Leaflet and friends. Seems as tools like Vector tiles/PMTiles have supplanted topojson for a lot of visualization oriented use cases.
GeoJSON is super useful. At Getcho (delivery, logistics) we use zip code GeoJSON encodings to draw polygons on zone maps and quickly generate rates. This has been a persistently annoying thing to do until we discovered this format. If you're curious, someone made a repo with all the 2010 census zips a while back [0].
[0] https://github.com/OpenDataDE/State-zip-code-GeoJSON/blob/ma... although you can generate newer versions from the last census.
About 25% of ZIP codes don't have a corresponding Census Bureau ZCTA, for example 10118. Do you end up needing special handling for those cases? Or has it not yet come up in practice?
Excellent question it certainly does come up. Practically speaking the more populous zip codes are all accounted for and that’s where the vast majority of deliveries go to. For example I took the census zip code data 150 miles (crow flies) outside Philly and found virtually 100% coverage.
For missing ones you have to fall back to distance based estimates and in my business that means you’re quote may be off and you’re exposed
No shade whatsoever at you or your business: I'll say upfront that you certainly made the right practical decision for the goal of running a business.
That said, this is a textbook example of what I have always found so infuriating, personally, about working on commercial software, and one of the many reasons I ultimately moved into a non-software-writing role. The (very sensible and practical) shortcuts and tradeoffs that are commonly made due to time and cost constraints. The attitude of "well the vast majority of our use cases work, so we're done." I've always thought edge cases must be addressed. Something in my brain hurts when I knowingly release something where only 99% of cases work.
I can imagine this is probably the same thing some artists feel when they are commissioned to produce (in their view rushed, flawed, or incomplete) artwork for business purposes.
I only write software at home, as a hobby now, and this gives me the outlet to follow my heart around edge cases!
[delayed]
Have been using GeoJSON, very handy and human-readable, but we recently switched to GeoPackage files, as it allows for different layers, each with a different schema for additional data.
GeoPackages also allow to set a proper CRS, which is not as easy in GeoJSON IIRC.
Getting your CRSes wrong is fun...
GeoJSON is not just for geographical features! Shapes of any kind work just as well.
QuPath[1], a tool for digital pathology whole slide image analysis, can export annotations in GeoJSON format (and import too I suppose).[2] This makes it really very easy to make annotations transportable between tooling.
[1] https://qupath.github.io/
[2] https://github.com/qupath/qupath-docs/blob/main/docs/advance...
Also https://postgis.net/
And with PostgREST [0], you can automatically convert any PostGIS table (with geometry or geography column) to GeoJSON by using an "Accept: application/geo+json" header in the request.
[0] https://docs.postgrest.org/en/v14/how-tos/working-with-postg...
Also https://github.com/timescale/timescaledb
I've found it very useful for storing geospatial data over time.
I love GeoJSON :) You can bring any Geo/GIS from 0 to visualization by just parsing it into GeoJSON.
geojson.io is a great editor/viewer by Mapbox. Also https://kepler.gl/demo is great for additional filtering, visualizations like heatmaps, arcs etc.
A extension to GeoJSON that works with JSONL-like semantics would be great for huge files, but this could also be solved by tiling.
Dunno whose website this is, but the format itself is great, and it allows for a relatively compact and relatively human-readable presentation.
A few weeks ago I (vibe)coded mxmap.be and if not for the ubiquity of geojson, it would have taken me significantly more effort.
What a great project.
I vibe coded something similar (different data source) with codex that went something like SQLite->GeoJSON->Leaflet and it was a dream - almost no corrections necessary. It even went off and found a really nice colour scheme for me.
Nice work with mxmap. It's a very good way to appreciate to what extent EU depends on US - email providers being just one of several dimensions.
This is nice. I haven't worked with GIS data for ages but I really like the idea of a well-understood plain text container for it. Much nicer than wrangling with binary formats like shapefiles, especially when something goes wrong and you're not sure if it's your code (well more precisely your usage of whatever library you've got for it) or the data.
Looks like what any sensible dev would come up with if asked to "return this geo data as json". I like simple!
nice and simple, great. but because it's json, most parsers are horribly inefficient, which is tough, because a lot of geodata is massive.
JSON parsing is probably one of the most thoroughly optimized subsystems in the whole industry at this point. Obviously there are ways to encode the same data that are easier to parse (e.g. instead of absolute floating point coordinates, use integer deltas along a path in some reasonable CRS) but because this inefficient representation is so common and so long-standing the parsing is faster than people think.
the default parsers all load the entire thing in mem, which is not good.
so you need a stream-based parser, which nobody does an effort to write/use for json. especially since geojson is a web format, and people just default to json.parse, which is blocking. and even then, even if you did use the custom one, it likely won't be a geojson-tailored one, so because key-order isn't guaranteed, any parser for geo-json will need to do some acrobatics to finding the reference-system, dealing with arbitrarily nested geometries etc..
it's a good format for what it is, but it's not a great geo-format. a geo format needs to be easily scannable and, even better, have a geometry index to be able to seek quickly.
The properties key is plural but contains a dictionary. Does the schema allow for this to be a list?
Nope, properties must be an object (dictionary or null). Which means each property can only appear once.
The spec doesn't say what type the value of a property can be, though. Examples in the RFC show strings, floats, and a nested object. So you could probably put a list in there as well if you want to store multiple values under the same key, provided that your decoder knows what to do with such values. (GeoJSON is often converted to and from WKB/WKT, and unorthodox values may be lost in the conversion.)
Also, JSON! Wow.
vega-lite supports rendering of GeoJSON via 'geoshape'
https://vega.github.io/vega-lite/docs/geoshape.html
Recently I got into cartography software for a bit and the horrors of the data formats in this industry are real. Feels like everyone under the sun has their own.
All that said, GeoJSON was a great change of pace, I enjoyed using it. While I'm no professional and have no idea what the professional needs are, it was very good for my hobbyist needs.