Looks cool but your home page heavily lags on my mbp m3 pro - you should maybe be lazy loading vs loading all your components upfront
Valid feedback, and a few other comments have mentioned it. I'm going to try improving the home page tomorrow.
The performance of the "Layout blocks" component is particularly bad. It consumes tons of CPU when scrolling.
Super cool. Working on a local AI tool specifically for document workflow automation (where context = screen/web/folders/files), and this could come in super useful. I do most of the PDF/DOCX/etc. parsing natively in Rust, but having a nice way to see the output without spinning up Word or Powerpoint is a huge leap.
Thanks for releasing publicly.
nice - did you write a custom parser for PDF/DOCX? we wrote one for XLSX after running into event loop issues with SheetJS
Using lopdf[1] for PDF parsing, rtf-parser[2] for RTF, calamine[3] for XLSX, and I'm sure you know that DOCX/PPTX/etc. is basically just a zip file of XML + text. The LLM cares about textual data (which just gets moderately cleaned up post-extraction), so I (thankfully) didn't have to deal with rendering. But showing a preview or end-result to a user would be a huge plus, so I can see myself using your library.
[1] https://github.com/J-F-Liu/lopdf
[2] https://github.com/d0rianb/rtf-parser
[3] https://github.com/tafia/calamine
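To illustrate the "zip file of XML + text" point: once `word/document.xml` has been read out of the DOCX archive, the text sits in `<w:t>` runs inside `<w:p>` paragraphs. The sketch below is only an illustration of that idea, not the commenter's Rust pipeline. `extractDocxText` and `decodeEntities` are hypothetical names, the unzip step is assumed to have happened elsewhere, and the regex approach ignores nested paragraphs (e.g. text boxes), so a real XML parser would be the safer choice.

```typescript
// Hypothetical helper: decode the five predefined XML entities.
function decodeEntities(s: string): string {
  return s
    .replace(/&lt;/g, "<")
    .replace(/&gt;/g, ">")
    .replace(/&quot;/g, '"')
    .replace(/&apos;/g, "'")
    .replace(/&amp;/g, "&"); // last, so "&amp;lt;" becomes "&lt;" and not "<"
}

// Hypothetical helper: pull plain text out of the contents of word/document.xml.
// Assumption: the XML string was already extracted from the zip archive.
function extractDocxText(documentXml: string): string {
  const paragraphs: string[] = [];
  // <w:p> or <w:p ...> opens a paragraph; <w:pPr> and friends do not match.
  const paragraphRe = /<w:p[ >][\s\S]*?<\/w:p>/g;
  let p: RegExpExecArray | null;
  while ((p = paragraphRe.exec(documentXml)) !== null) {
    // <w:t> or <w:t xml:space="preserve"> holds text; <w:tab/>, <w:tbl> do not match.
    const textRe = /<w:t(?:\s[^>]*)?>([\s\S]*?)<\/w:t>/g;
    let text = "";
    let t: RegExpExecArray | null;
    while ((t = textRe.exec(p[0])) !== null) {
      text += decodeEntities(t[1]);
    }
    paragraphs.push(text);
  }
  // One output line per paragraph.
  return paragraphs.join("\n");
}
```

The output is plain text with one line per paragraph, which matches the "moderately cleaned up post-extraction" use case where rendering is not needed.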
What about rendering? That's the hard part.
we built a library @extend-ai/react-xlsx on top of it that renders the parsed contents onto a canvas
testing was mostly manual with a test corpus we generated. it's not perfect but it's pretty close for most files we've seen
For me, rendering was just a nice-to-have.
Sorry I meant to ask the author of Extend UI not you.
Those bounding box demos are decent.
By a quirk of fate I've spent the past 2 days prototyping some stuff on pdfjs. Just trying to figure out a game plan for handling bounding boxes in the face of page zooming, different resolutions, etc. I can't see it mentioned whether the components are virtualising pages (as in reusing DOM elements as document pages scroll by). I guess I just learned what I'll be exploring tomorrow then...
yes - the pdf/docx viewers use react-virtual to virtualize the pages
the zoom should work with the bounding box highlights, we're working on adding rotation support
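For anyone working through the same bounding-box question, here is a minimal sketch of one common way to keep highlights aligned under zoom: store each box in PDF page units and convert to CSS pixels at render time. This is not taken from the Extend UI code. `PdfBox`, `CssRect`, and `pdfBoxToCssRect` are hypothetical names, and the sketch assumes boxes in PDF points with a bottom-left origin, an unrotated page, and no crop-box offset.

```typescript
// Box in PDF user space: points, origin at the bottom-left of the page (assumption).
interface PdfBox { x0: number; y0: number; x1: number; y1: number }

// Rectangle for an absolutely positioned overlay: CSS pixels, origin top-left.
interface CssRect { left: number; top: number; width: number; height: number }

// Hypothetical helper: convert a stored PDF-space box into overlay coordinates.
// `scale` is CSS pixels per PDF point at the current zoom level.
function pdfBoxToCssRect(box: PdfBox, pageHeightPt: number, scale: number): CssRect {
  const left = Math.min(box.x0, box.x1) * scale;
  const width = Math.abs(box.x1 - box.x0) * scale;
  const height = Math.abs(box.y1 - box.y0) * scale;
  // Flip the y axis: the box's upper edge in PDF space becomes `top` in CSS space.
  const top = (pageHeightPt - Math.max(box.y0, box.y1)) * scale;
  return { left, top, width, height };
}
```

Because the stored box never changes, a zoom change only means recomputing with a new `scale`. Rotation would need an extra transform, which this sketch leaves out.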
Why doesn't it mention anywhere that they are React components?
we'll try to make it more clear on the landing/introduction page!
_Try_ to? Are you serious? We're not talking about ambiguity due to phrasing, we're talking about straight up not mentioning it on the front page, on the show all page, or even on the page the components button takes you to. Not even on this forum thread. There's barely any text on the front page to change, and there are multiple places you could mention React. You could mention it in the title, e.g. "Open source _React_ UI kit for modern document apps", or you could add it to the second paragraph, e.g. "React components ready to drop into user-facing flows, agents, or internal tools." Not to mention the components page. And given what you use to make the website, how does it take this long to update something this trivial when you are going out of your way to promote it, replying to comments and your other services by proxy?
Also, add either user-interactive loading of components or lazy load the demos; the number of demos murders performance on phones.
coming back to this now and added it in a few places
on the demos - everything below the fold is lazy loaded but i will see what we can do to improve the mobile perf
Tone.
How is your PDF coverage? They are notoriously difficult things to render, with endless edge cases. Mozilla’s PDF.js is the status quo here, so what do you do better than them?
We use react-pdf for the viewer, which I believe uses PDF.js under the hood
We aren’t trying to reinvent that engine, rather just provide a building block for people to plug their design system into its controls
Cool project! I was playing around with the Excel viewer - the docs claim "Search across sheets and cell ranges", but I can't seem to trigger the search functionality, and the browser's find-in-page can't find the contents of cells.
Is this a known issue?
These should really be web components. Leaving out every framework other than React is really bad for the web.
Then build it yourself. They do work for free, give it to the world, and your response is: do it differently the way I want.
let's all try to be nicer
that's fair and definitely something we can try supporting in the future. we started with React because of how familiar models are with shadcn and tailwind
we hope this can be useful for people building in React though!
How does it compare to https://news.ycombinator.com/item?id=48436863 ?
if you mean our docx viewer/editor specifically, it's hard to say without manually testing the visual fidelity against Word on some complex docx files
you are welcome to try it with your own documents and see, but it's just one example we wanted to show. for the blocks that use the react-docx library, you can always copy the code and use a different method to render the docx file/thumbnails
Thanks, that looks awesome! We were looking to add DOCX and XLSX preview to our app, and were planning to do server-side conversion to PDF (which seems to be what most other apps resort to) due to the lack of good libraries to render it, and this is exactly what we were looking for! :)
thanks! would love to get your feedback
i can't promise it's visually 1:1 with Word/Excel but it's pretty close on the corpus we tested with
This is really interesting. Thanks for creating this.
Looks clean and works fine, but it needs optimization. Clicking "Type" in the "schema builder" example on the landing page takes 1-2 seconds to open the popover (MacBook Pro M4). I think it's because there are lots of heavy components, but it's still too slow.
it should be much faster on the individual component's page
the root page is a bit slow with all the viewers. in practice you probably won't have that many on one page in your app
really like these - curious how the xlsx editor and viewer are built. is there some kind of headless spreadsheet underneath?
could not have been easy
Hi! I'm one of the engineers at Extend who worked on this - one of our other engineers created a Rust XLSX/XLS parser that we ported to WASM for our @extend-ai/react-xlsx package, which handles the rendering/charts. It exposes some hooks so users can use their own components for the toolbar
How much was actual engineering and how much was telling an AI what to do?
Even if it was just prompting, not sure how that takes away from the final, fairly polished, product. How do you define "actual engineering"?
Excellent that you offer Miller columns, one of the best tools for computing and information browsing, and management. The world should run on Miller columns.
TIL those are called Miller columns !
Does it/will it support Markdown files?
i felt like we couldn't build much on top of react-markdown, which i think is what most people are using
> This page could not load
On mobile Safari…
hm seems to look ok on mine - is this on the root page?