Brush – A new compatible Gaussian splatting engine

(github.com)

178 points | by Tycho87 8 months ago

58 comments

Groxx 8 months ago

A request for everyone writing docs with content like this:

>NOTE: This only works on desktop Chrome 129+ currently. Firefox and Safari are hopefully [supported soon](link), but currently even firefox nightly and safari technical preview do not work.

This is great, especially with that link! Thank you! But please say when "currently" is, e.g. add an "(Oct 2024)". Stuff like this tends to be time-sensitive on accuracy but not consistently updated and is often years out of date with no easy way for visitors to tell.

And when it's recent, it also tells people that the project is active.

porphyra 8 months ago

Also, WebGPU isn't currently enabled by default on Chrome for Linux. You'll need:

> The chrome://flags/#enable-unsafe-webgpu flag must be enabled (not enable-webgpu-developer-features). Linux experimental support also requires launching the browser with --enable-features=Vulkan.

https://github.com/gpuweb/gpuweb/wiki/Implementation-Status#...
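
A rough sketch of what that launch could look like when scripted, in case it saves someone a trip through chrome://flags. Assumptions: the binary is called `google-chrome` (this varies by distro), and the `--enable-unsafe-webgpu` command-line switch is used as the counterpart of the flag quoted above. The helper names are made up for illustration.

```python
import shutil
import subprocess

DEMO_URL = "https://arthurbrussee.github.io/brush-demo/"


def chrome_webgpu_command(binary="google-chrome", url=DEMO_URL):
    """Build the argv for starting Chrome on Linux with WebGPU switched on."""
    return [
        binary,
        "--enable-unsafe-webgpu",     # assumed switch form of the chrome://flags entry
        "--enable-features=Vulkan",   # quoted above as required on Linux
        url,
    ]


def launch(binary="google-chrome"):
    """Start the browser if the (assumed) binary name is on PATH."""
    path = shutil.which(binary)
    if path is None:
        raise FileNotFoundError(f"{binary} not found on PATH")
    return subprocess.Popen(chrome_webgpu_command(path))
```

Calling `launch()` opens the demo; `chrome_webgpu_command()` alone just returns the argument list so it can be inspected or pasted into a shell.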

steve_adams_86 8 months ago

I’m realizing this might not cross some developers’ minds because the current time in that context is attached to the commit when it was added or changed.

I do this quite often. I probably shouldn’t, though. It’s only useful if you’re looking at commit logs or have an inline ‘last changed by [author] on [date]’ helper in your IDE.

Then again, even that could be made wrong by future edits.

Groxx 8 months ago

Yeah, there are frequently ways to figure out what date a relative measure is referring to; it's one of the best things about version control being a true norm in this field.

But it's a few extra steps (depending on the UI), and many will not take those steps. They'll just trust it (far beyond when it's relevant), or think "that's probably old" and doubt it (immediately, because old docs are so common).

It's relatively minor, but it's extremely easy to prevent, and just a better habit when communicating with the future.

codeflo 8 months ago

And that’s assuming the repository never gets reorganized in any way that doesn’t perfectly preserve history — which over long timespans, is bound to happen at some point.

Ameo 8 months ago

Wow - the in-browser demo (https://arthurbrussee.github.io/brush-demo/) runs way more performantly and renders much better-looking results than any other I'd tried in the past.

It loaded my 50MB .ply file almost instantly. Orbiting around the scene is extremely smooth and everything is free of flickering or artifacts.

I never tried out training a Gaussian splat from images/video myself before, but this tool makes me want to give it a go.

ArthurBrussee 8 months ago

Love to hear it!! Most viewers take some shortcuts, like only sorting every so often, so it's good to hear the difference is noticeable :)

Training a splat requires a lot less setup with this, but it does still require running COLMAP (https://github.com/colmap/colmap) first, which is still a big barrier... one thing at a time!
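
On the sorting shortcut mentioned above: splats are alpha-blended, so the result depends on the order they are drawn in. The following is a minimal single-pixel sketch of why a stale sort shows up as artifacts. It is not Brush's renderer; the `Splat` class and the `resort` parameter are hypothetical and only illustrate front-to-back blending, C = sum_i c_i * a_i * prod_{j<i} (1 - a_j).

```python
from dataclasses import dataclass


@dataclass
class Splat:
    depth: float  # distance from the camera along the view direction
    color: float  # a single grey channel, for brevity
    alpha: float  # opacity of this splat at the pixel, in [0, 1]


def composite(splats, resort=True):
    """Blend splats front to back at one pixel.

    With resort=False the list order is trusted as-is, which mimics a
    viewer that reuses a depth sort from an earlier camera position.
    """
    ordered = sorted(splats, key=lambda s: s.depth) if resort else list(splats)
    color = 0.0
    transmittance = 1.0  # fraction of light not yet absorbed
    for s in ordered:
        color += s.color * s.alpha * transmittance
        transmittance *= 1.0 - s.alpha
    return color


# A half-transparent white splat in front of an opaque black one,
# stored in the wrong (far-first) order.
scene = [Splat(depth=2.0, color=0.0, alpha=1.0),
         Splat(depth=1.0, color=1.0, alpha=0.5)]
```

With a fresh sort, `composite(scene)` blends the white splat over the black one; with `resort=False` the opaque far splat is drawn first and hides the near one entirely, which is the kind of popping a viewer gets when it sorts only every so often.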

dsp_person 8 months ago

Feel the same as GP here.

How expensive is the COLMAP step to run? I was also really impressed with the speed in the demo (but thinking that the shown training was the only step)

Could you ELI5 what the training is versus what the COLMAP part is?

ArthurBrussee 8 months ago

The input to this is two things: images and camera poses. The camera poses tell you where each camera was in 3D space (and some of its properties).

The training takes this information and makes a 3D model out of it, visually matching all your photos.
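
To make "camera pose" concrete, here is a small sketch that reads one pose line in the layout of COLMAP's text export (`images.txt`: `IMAGE_ID QW QX QY QZ TX TY TZ CAMERA_ID NAME`) and recovers where the camera was. COLMAP stores the world-to-camera transform, so the camera position in world space is C = -R^T t. This is only an illustration and not how Brush loads data; the `Pose` class and function names are made up.

```python
from dataclasses import dataclass


@dataclass
class Pose:
    name: str           # image file name
    rotation: list      # 3x3 world-to-camera rotation, row-major
    translation: tuple  # world-to-camera translation (tx, ty, tz)


def quat_to_matrix(w, x, y, z):
    """Rotation matrix of a unit quaternion (Hamilton convention, w first)."""
    return [
        [1 - 2 * (y * y + z * z), 2 * (x * y - w * z), 2 * (x * z + w * y)],
        [2 * (x * y + w * z), 1 - 2 * (x * x + z * z), 2 * (y * z - w * x)],
        [2 * (x * z - w * y), 2 * (y * z + w * x), 1 - 2 * (x * x + y * y)],
    ]


def parse_image_line(line):
    """Parse 'IMAGE_ID QW QX QY QZ TX TY TZ CAMERA_ID NAME' into a Pose."""
    fields = line.split()
    qw, qx, qy, qz, tx, ty, tz = map(float, fields[1:8])
    return Pose(name=fields[9],
                rotation=quat_to_matrix(qw, qx, qy, qz),
                translation=(tx, ty, tz))


def camera_center(pose):
    """Camera position in world coordinates: C = -R^T t."""
    r, t = pose.rotation, pose.translation
    return tuple(-sum(r[j][i] * t[j] for j in range(3)) for i in range(3))


pose = parse_image_line("1 1 0 0 0 1 2 3 1 frame_0001.jpg")
center = camera_center(pose)
```

The sample line uses an identity rotation, so the camera center is just the negated translation. Real exports also carry a second line of 2D keypoints per image, which this sketch ignores.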

COLMAP can still be quite expensive and a hassle, sadly: on the order of half an hour, as opposed to seconds. There are modern alternatives like https://lpanaf.github.io/eccv24_glomap/, or even deep-learning-based systems like https://github.com/naver/dust3r

This is definitely still a big blocker to adoption. The goal is to get to a more all-in-one system. The splatting optimization can also help align cameras, if they don't start out entirely random, so any system to quickly provide a good "initial guess" will help here. At least for mobile devices, initialization from ARCore / ARKit poses should be enough.

Keep an eye out :)

bob112 8 months ago

If you're capturing on a mobile device, why not use Scaniverse? It's about as all-in-one as it gets - you just scan and it'll generate a .ply after a minute or two of processing. They'll host the splat for you in the cloud if you want too.

jaredkrauss 8 months ago

For me, at least, I want to own all my data, and not give any away without explicit permission. So, even in the case of Scaniverse, I'm reluctant.

But I'm just an artist trying to read and learn, and haven't yet gotten around to actually figuring out how to do all this on my Macbook Pro M1 yet ^-^

wellthisisgreat 8 months ago

how do you get the .ply file to load into it? which software do you use to generate those files?

Ameo 8 months ago

I used an app on my phone called Scaniverse

I'm sure there are others as well

rallyforthesun 8 months ago

Thank you for releasing this. It is the first option afaik, to generate a 3D Gaussian on a Mac without a gpu (using M1 Pro). It is quite slow, but quick enough to test-train a dataset while onsite, without the need to carry heavy workstations around! I really like the option to use rerun.io for training analytics. Again, thank you.

ArthurBrussee 8 months ago

You're welcome and thank you for trying it out!

Hard at work to make performance better - the "main" kernels are at least as fast as gSplat, so now need to remove other overheads.

That, and make splatting train more efficiently in general, lots of compute is wasted on small steps.

Ps: the web version takes a minute to warm up and is generally slower, do try a native version if you haven't yet!

rallyforthesun 8 months ago

Thanks for the advice. I did compile the repo on my M1 using VS Code, but I was comparing the speed to my workstation's RTX 4090, and that comparison is not appropriate.

throwaway2562 8 months ago

What are splats actually useful for, and where are they used?

eurekin 8 months ago

Corridor Channel had one great example: https://youtu.be/GaGcLhhhbDs?si=vDyeayLf8EAoE0gf&t=442

Above includes the explanation. Final result is here:

https://youtu.be/GaGcLhhhbDs?si=eoTniegWK-AVFoaF&t=751

twelvechairs 8 months ago

Making a movable 3D scene from limited initial information. Some pros and cons against a traditional '3D model' approach. Pros - faster/simpler to generate (especially with lots of data), better at dealing with light and reflection. Cons - having 3D geometry can be useful, e.g. for collision detection, volumetric understanding, surface alteration/deformation etc.

Not much widespread use right now - Possible commercial use cases are things like real estate walkthroughs and maybe replacing a google street view with something more interactive.

lioeters 8 months ago

Excellent summary, thanks!

ArthurBrussee 8 months ago

It's really the latest incarnation in the field of Photogrammetry https://en.wikipedia.org/wiki/Photogrammetry - aka, converting 2D images / video to 3D data.

Imagine one of those house tours on Zoopla on steroids, or street view but smoother.

two_handfuls 8 months ago

Splats are good for generating new images from an existing place even if no photograph exists from that exact viewpoint.

They can be used for video special effects, for 3D images/video, and for VR. The technology is nascent but shows promise.

peej555 8 months ago

What's cool about this is the visualisation gives you a good intuition on how the training is working.

Sometimes I feel like it should be able to get more details in certain areas, but it's always looking at things holistically.

I wish you could give it a 3D bounding box and say "work on this area only", which I think is something that should be possible?

ArthurBrussee 8 months ago

That was in some way the original motivation for the project!

I think if you are reconstructing your own data the algorithm better just work, without input, ideally.

But, imagine you could add in generated videos. Lay down a camera path, tell it what to generate, and add it to the reconstruction. A brush stroke one might say ;)

DarmokJalad1701 8 months ago

Gaussian splatting using Burn has been on my side project list for a while now. I guess they beat me to it! :)

t43562 8 months ago

Sorry for the dumb question: What are the inputs? Photos? Videos? Any other data?

And then, what's the output?

Otherwise I find the whole website far too "involved" to understand what it's doing at all. Someone who already understands the area won't have my trouble of course.

ArthurBrussee 8 months ago

Not a dumb question! This first version is still mainly targeted at people who are in this area, and at generating some excitement. I do hope to make this more accessible though!

The inputs are 1. images 2. with a pose. The usual way to get poses for your images is https://github.com/colmap/colmap.

The output is a 3D model. Specifically a "Gaussian Splat", which is a sort of fuzzy point cloud. There are some tools out there to view & edit these (besides Brush), eg. https://playcanvas.com/supersplat/editor.
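
If you want to peek inside one of these files, a splat .ply is a regular PLY point cloud with extra per-point properties, and the header is plain text even when the body is binary. Here is a minimal header reader as a sketch. The property names in the sample are illustrative; actual exports vary by tool.

```python
import io


def read_ply_header(stream):
    """Return (format, vertex_count, vertex_property_names) from a PLY header.

    `stream` is a binary file object positioned at the start of the file.
    """
    if stream.readline().strip() != b"ply":
        raise ValueError("not a PLY file")
    fmt, count, props, in_vertex = None, 0, [], False
    for raw in stream:
        parts = raw.decode("ascii").split()
        if not parts:
            continue
        if parts[0] == "end_header":
            return fmt, count, props
        if parts[0] == "format":
            fmt = parts[1]
        elif parts[0] == "element":
            # Properties that follow belong to this element.
            in_vertex = parts[1] == "vertex"
            if in_vertex:
                count = int(parts[2])
        elif parts[0] == "property" and in_vertex:
            props.append(parts[-1])  # last token is the property name
    raise ValueError("missing end_header")


# Tiny in-memory sample header (illustrative property names).
SAMPLE = io.BytesIO(
    b"ply\n"
    b"format binary_little_endian 1.0\n"
    b"element vertex 2\n"
    b"property float x\n"
    b"property float y\n"
    b"property float z\n"
    b"property float opacity\n"
    b"end_header\n"
)
header = read_ply_header(SAMPLE)
```

For a file on disk, open it with `open(path, "rb")` and pass the file object in. The vertex count is the number of Gaussians, and the property list shows what each one stores.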

andybak 8 months ago

Considering pose generation is often the slowest part - any plans to tackle this? There's quite a few papers that claim to do away with COLMAP.

ArthurBrussee 8 months ago

Yes :) But it will take a while!

WhatIsDukkha 8 months ago

One of the things that's held me back from being super interested in this field is my understanding that some kind of mesh backing is likely needed for this to progress.

IIRC some researchers had started to back the gaussians with a mesh to provide an editable artifact that would allow the gaussians to be moved and manipulated.

Is this anywhere near being a standard feature yet?

edit - ie https://arxiv.org/abs/2402.04796

aDyslecticCrow 8 months ago

I expect this tech to make big waves in the backend of robotic systems soon. Accurate SLAM with accurate semantic tagging of objects is a big deal. An efficient and accurate reality-to-simulator translation, allowing reinforcement learning in simulated environments to be directly applicable to reality.

Begone lidar units for basic robot tasks! All praise, normal cameras! (though it's far too slow to run on autonomous cars, since the environment changes so rapidly)

porphyra 8 months ago

Accurate SLAM with accurate semantic tagging would be a big deal, yes, but this project still relies on pretrained data with COLMAP so how is that relevant to your comment?

aDyslecticCrow 8 months ago

I'm not saying you can use THIS method for SLAM; I'm saying you can use this METHOD for SLAM.

You don't need a perfect COLMAP for this method (well, not this in particular, but for this method more broadly with some modifications); you just need an approximate location for a few of the images to start and then match the others progressively... which is literally what SLAM is all about.

And "pre-trained data" makes no sense. It's trained, as in slowly chewing iteratively on the data before getting decent 3D space, but that just means it's a bit slow. Hence, my mention of simple robots that move in a semi-fixed environment rather than being unusable for self-driving.

But more broadly, it's a method to describe the real-world appearance of 3D space, which may have computational and flexibility advantages over massive point clouds.

neom 8 months ago

Is this kinda tech eventually going to sometimes sub in for compression and/or codecs? Feel like it could be kinda applicable to streaming?

kreelman 8 months ago

This is great. Thanks so much for putting this together. It works on a laptop with a so-so graphics card... But it's the first time I've ever been able to process a Gaussian splat myself.

jtrueb 8 months ago

async_std is a nonstandard choice these days, no? I assume this is related to the style of blocking spawned work?

ArthurBrussee 8 months ago

I'm a bit out of the loop on async runtimes. I know Tokio is of course the big one, but that seemed much too heavy to just run some tasks, and it isn't very WASM compatible afaik.

Otherwise there's smol, and maybe others? Would love to hear what a good web WASM compatible async framework is nowadays!

efilife 8 months ago

What's it compatible with?

ArthurBrussee 8 months ago

Devices & operating systems! Windows/Mac/Linux, AMD / NVIDIA / built-in GPUs, Android/iOS, or running in a browser context.

My bad, I really bungled the original tweet the title is from :)

IshKebab 8 months ago

Does it do the SfM step?

rallyforthesun 8 months ago

no, it expects a zip file with already aligned images (for training) or the pointcloud itself (for viewing)

fzy95 8 months ago

No you still need to run COLMAP

rallyforthesun 8 months ago

you could also use Metashape

andybak 8 months ago

Or Reality Capture which I think can do it for free?

rallyforthesun 8 months ago

In metashape, you can export the cameras in the colmap format, in RC you might have to convert first to Kapture and so on, afaik.

ArthurBrussee 8 months ago

Yeah, this is the next big challenge. There are some ideas about what to do, but one step at a time!

brcmthrowaway 8 months ago

Does Polycam support splatting?

albumen 8 months ago

Yes.

Alifatisk 8 months ago

Super compatible?

dang 8 months ago

We've made it less super and also deinrusted it in the title above.

(submitted title was "Brush – a new super compatible Gaussian splatting engine in Rust")

rc00 8 months ago

> deinrusted

Out of curiosity, what is the motivation or policy here? This feels like a change in stance.

Groxx 8 months ago

To place a barely-educated guess: because "... In Rust" is enough of a trope around here that it brings very specific crowds of people out to argue the same points each time, whether it's even remotely relevant to the link or not.

Better to just avoid it unless the "in rust" part is somehow intrinsically relevant (e.g. it's in rust for specific reasons that were previously too hard in other languages)

rc00 8 months ago

There is still brigading that happens regardless. I understand trying to minimize the battles but when there are actors propping up these same posts, it ends up having the opposite effect. (There are coordinated audiences on Discord and Mastodon looking to swarm these posts and game things like the front page of HN.)

Better to suppress these posts if the desire is to avoid the inevitable arguments with the bonus that it can be automated rather than requiring manual intervention. Otherwise, it likely ends up a well-intentioned but poisonous pill. I don't know if this will stop the most motivated members though.

Groxx 8 months ago

Part "yes, you have a point" and part "this is letting perfect be the enemy of good".

There is apparently enough time and energy for manual intervention, given that it just happened - if it isn't making things worse, it may still be worth doing. Particularly since brigading tends to move in temporary bursts.

rc00 8 months ago

You would consider daily a temporary burst? ;)

Groxx 8 months ago

It has abated quite significantly from its peaks, yes.

(I say this as a user, and users are the target of these actions, so I think it's pretty relevant that at least some feel that way)

dang 8 months ago

Rust is well enough established at this point that titles don't need the extra juice.

ArthurBrussee 8 months ago

I think the title is taken from my original tweet which I really bungled, my bad :) Hopefully the readme does a better job!