I agree complex conditional LangGraph setups get pretty tedious after a certain point - though you claim to not use graphs here, but isn't that essentially what returning the "next task" does? The graph isn't explicitly defined but it still exists implicitly if you trace through all the tasks and next tasks.
Would be interesting to see a complex agent implementation in both Flow and regular LangGraph to compare maintainability.
In some sense, yes, task dependencies do form a graph. However, what I meant when I said that the graph is the wrong abstraction is that predefined edges are not the best way of writing code when dealing with dynamic systems such as AI agents.

As you said, even the easiest conditional workflows are tedious, and we have to rely on "conditional" edges to solve the simplest stuff. Then, how would you define cycles, or spawning the same task multiple times, as a node-edge relation? It becomes extremely inconvenient and just gets in your way. The reason I built Flow is exactly that I previously relied on a predefined node-edge system, which I also built (https://docs.lmnr.ai/pipeline/introduction).

Also, it's impossible to do MapReduce-like operations in a node-edge system without inventing some special node for handling that.
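The MapReduce point comes down to the number of "map" tasks only being known at run time, so a static node-edge graph can't draw those edges up front. A plain-threading sketch of the pattern (all names here are made up for illustration; this is not Flow's API):

```python
import threading

def map_reduce(items, mapper, reducer):
    results = [None] * len(items)

    def worker(i, item):
        # one dynamically spawned "map" task per input item
        results[i] = mapper(item)

    threads = [threading.Thread(target=worker, args=(i, it))
               for i, it in enumerate(items)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()                 # fan-in: wait for every spawned task
    return reducer(results)      # the "reduce" step
```

For example, map_reduce(range(5), lambda x: x * x, sum) computes 30. In a predefined-edge graph, the five worker "nodes" and their edges would have to exist before the input is known.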
The idea of comparing LangGraph and Flow is really good, and I will work on that asap! What kind of workflows would you love to see implemented in Flow? Btw, there are a lot of examples in the readme and in the /tests folder.
we're working on a way to make these conditional flows less tedious :) some good devX improvements coming soon!
Sometimes it reminds me of the evolution of process management software.
What might be new to those developing tooling with AI, might not be new to other areas of software.
Interesting! I feel like this is a cross between https://github.com/dagworks-inc/burr (switch state for context) and https://github.com/Netflix/metaflow because the output of the "task" declares its next hop...
The challenge with this approach, though, is that you need to run the actual code to see what it does, or build up a mental model of the code as a developer ... but it does shine in certain use cases -- and it also reminds me of https://github.com/insitro/redun because it takes this approach too.
burr looks very interesting!
...also MemGPT (now called Letta) https://github.com/letta-ai/letta
I tend to agree the graph approach is wrong.
I have a meta question though - there seems to be a huge amount of activity in this space of LLM agent related developer tooling. Are people actually successfully and reliably delivering services which use LLMs? (Meaning services which are not just themselves exposing LLMs or chat style interfaces).
It seems like 18 months ago people were running around terrified, but now almost the opposite.
I once had a dream to be a writer. I made a tool that was going to help me write every day. The most insightful feedback I got was: "Wow, aspiring writers will do anything to avoid writing!"
I think something similar applies here: you see way more dev tools than successful products because people can't build many successful products using LLMs. Building a devtool (the old-fashioned way) is something that people can tangibly do.
My main company is Laminar (https://www.lmnr.ai) and we actually help folks ship reliable LLM software to prod. We've seen many of our clients do that successfully. Although the entire space is very new for sure, and things are changing every day.
What kind of stuff are people doing? I understand if you don't want to be too specific, but do you have any interesting (vague) examples using agents that aren't in the chat bot space?
- AI auditor of OTel traces
- wealth manager advisor
- AI data engineer

to name the most interesting cases without giving too many details
but also many chat bots and assistants too
Folks are still getting funded for these sorts of ideas (LLM-based agents) but until I see a single working proof of concept, I'm convinced they're all doomed.
So as not to fall under what some people here criticize as making an ad for a competitor, I'm not going to name or link the project I'm about to mention - but big parts of this are very similar to a project I've been working on for multiple years now.

While my project is not meant to be used specifically with LLMs, what I built (or am building) is a system that has no specific predefined sequence, but is rather a composition of "actions", each of which has a specific defined requirement in the form of a data structure necessary to execute the "action". I built it to be self-supervising, without a single task-scheduling mechanism, to enable people to write data-driven applications.

It's nice to see that I'm not the only one trying to go this way (yeah, I didn't expect that to be the case - don't worry, I'm not Elon-Musk crazy).

While I built it for completely different use cases (when I started, LLMs weren't as big a thing as they are now), it's definitely a cool and creative way to use such an architecture.
Gl hf :) and thumbs up
PSA: It’s really rude when someone does a show HN to go and plug your own competitor. Let them have their moment
I kinda disagree. This is a discussion forum first; advertising should just be a side effect. Anyone posting here should do so with the intention to spark discussion and elicit the greater wisdom of the community. And if that wisdom happens to be comparisons to prior art, so be it. I don't think it should take anything away from the work, but only expand both the author's and the community's awareness of the topic.
Noted. I had no intention to be disrespectful in any sense. Just pointing out flaws and reasons why and how current project came to be.
Not talking about you!
oh, got it :)
What are you referring to?
Neat ideas here. I've listed 3 thoughts/concerns:

1. Deadlocks
2. Programmer experience
3. Updating the code with in-flight tasks

To avoid deadlocks, it seems like the executor needs to know what the dependencies are before attempting to execute tasks. If we have 4 threads of execution, we could get into a state where all 4 of our threads are blocked on semaphores, each waiting for another task to provide a dependent value. And at scale, if it can happen, it definitely will happen eventually.

Potentially related: it could make sense for the engine to give preference to completing partially completed tasks before starting fresh ones.

Also, I wonder if there's a way to lay out the specification of tasks so it looks more like normal code, potentially with async/await. It's much more natural to write a program with 2 steps as ordinary sequential calls than to redo it in a task style. If I have 7 steps with some loops and conditionals, it becomes much more difficult to grok the structure with the disjointed task layout, and much more difficult to restructure and reorganize it. I wonder if there's a way to pull it off where the distributed task structure is still there but the code feels more natural. Using this framework, we're essentially writing the code in a fragmented way that's challenging to reorganize and reason about.

What will happen when we change the task code and deploy a new version? I wonder what happens to the tasks that were in flight at the moment we deploy.
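The commenter's original code snippets were lost in formatting. A minimal sketch of the contrast between the two styles, using a made-up task decorator and registry (not Flow's real API), might look like:

```python
# Style 1: "normal" code -- the control flow is the program text itself.
def step_one(x):
    return x + 1

def step_two(y):
    return y * 2

def pipeline(x):
    y = step_one(x)
    return step_two(y)

# Style 2: task style -- each task names its successor, and the control
# flow only emerges when an engine walks the task ids at run time.
TASKS = {}

def task(next_id=None):
    def register(fn):
        TASKS[fn.__name__] = (fn, next_id)
        return fn
    return register

@task(next_id="step_two_task")
def step_one_task(state):
    state["y"] = state["x"] + 1

@task()
def step_two_task(state):
    state["out"] = state["y"] * 2

def run(start, state):
    task_id = start
    while task_id:
        fn, task_id = TASKS[task_id]
        fn(state)
    return state
```

Both compute the same thing (pipeline(3) and run("step_one_task", {"x": 3}) each produce 8), but in style 2 the reader has to chase string ids through the registry to reconstruct the control flow.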
Thank you for your comment! I wanted to add some clarifications.

1. Tasks are not explicitly called from another task. In your example, greet() is never called; instead, the task with id=greet is pushed to the queue.

2. The reason I opted for the distributed task approach is precisely to eliminate chains of await task_1, await task_2, and so on. Going back to point 1, a task just tells the engine: ok buddy, now it's time to spawn task_2. With these semantics we isolate tasks and don't have to deal with outer tasks that call other tasks. Also, parallel task execution is extremely simple with this approach.

3. Deadlocks will happen iff you wait for data that is never assigned, which is expected. Otherwise, given the design of the state and the engine itself, they will never happen.
https://github.com/lmnr-ai/flow/blob/main/src/lmnr_flow/stat...
https://github.com/lmnr-ai/flow/blob/main/src/lmnr_flow/flow...
4. For your last point, I would argue the opposite is true: it's actually much harder to maintain and add new changes when you hardcode everything, hence why this project exists in the first place.

5. Regarding deployment: Flow is not Temporal-like (yet), everything is in-memory, but I will def look into how to make it more robust.
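A toy sketch of the semantics in points 1 and 2: a task never calls another task; it returns the ids of the tasks to spawn next, and the engine pushes them onto a queue. All names are hypothetical; this is not Flow's actual engine.

```python
from concurrent.futures import ThreadPoolExecutor

def fetch(ctx):
    ctx["data"] = [1, 2, 3]
    return ["double", "triple"]          # ask the engine to spawn two tasks

def double(ctx):
    ctx["double"] = [x * 2 for x in ctx["data"]]
    return []                            # nothing further to spawn

def triple(ctx):
    ctx["triple"] = [x * 3 for x in ctx["data"]]
    return []

TASK_TABLE = {"fetch": fetch, "double": double, "triple": triple}

def run_tasks(start, ctx):
    queue = [start]
    while queue:
        batch, queue = queue, []
        # every task id currently in the queue can execute in parallel
        with ThreadPoolExecutor() as pool:
            for spawned in pool.map(lambda tid: TASK_TABLE[tid](ctx), batch):
                queue.extend(spawned)
    return ctx
```

Because fetch returns two ids, double and triple run in parallel without any predefined edges between the three tasks.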
Why limit yourself to AI and agents? It seems abstract enough that the tasks and parallel execution could be anything.

Moreover, there isn't an example of how it `could` work for inference or function calls/tools/agents.
But looks simple to get started and feels like it has the right foundation for good things to come. Kudos!
Agreed. It seems like this has nothing to do with agents. In particular because this is well trodden territory that frameworks like Airflow have been tackling since _well_ before the recent deep learning craze.
I would say that unless the agent-based examples are highly compelling, it makes more sense to simply remove the agent stuff from the pitch entirely, lest you inevitably be accused of taking advantage of the current AI hype for an otherwise unrelated piece of technology (one that happens to be a good fit for agent-based workflows - something I haven't observed to work very well even with the best models).
For me, an Agent is essentially a decision-making machine. I went through many iterations of the software around building agents, and Flow is the culmination of all of those learnings.

For some reason, LLM-specific examples just slipped my mind, because I really wanted to show the bare-bones nature of this engine and how powerful it is despite its simplicity.

But you're also right: it's general enough that you can build any task-based system, or rebuild a complex system with a task architecture, with Flow.

The signal is clear: add more agent-specific examples.
I did something similar to this: https://github.com/memodb-io/drive-flow.
It supports tasks, dynamic routes, and parallel execution using pure Python built-ins (zero deps). But it's just a side project, so there's no persistence; it's just an easy tool.
Totally makes sense; I don't know why I missed agent-specific examples... will add them asap.

I focused on the Agent use case because Flow was initially built for that internally. But you are right: for the release I made it extremely general, to be used as a foundation for complex systems.
Nice approach!
I'm intrigued by the ability to start execution from a particular task.
One thing I like about LangGraph is the declarative state merging. In the MapReduce example, how do you guarantee that the collector.append() operation is thread-safe?
Thank you! Good question. Behind the scenes, state is managed via a Semaphore. Each call to .get(...) acquires a semaphore with a single permit, which essentially guarantees thread-safety. Check out how the state is implemented here: https://github.com/lmnr-ai/flow/blob/main/src/lmnr_flow/stat...
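A minimal sketch of what a single-permit-semaphore state store could look like (not Flow's actual implementation; see the linked source for that):

```python
import threading

class State:
    def __init__(self):
        self._data = {}
        # a semaphore with one permit behaves like a mutex
        self._sem = threading.Semaphore(1)

    def get(self, key, default=None):
        with self._sem:              # acquire the single permit
            return self._data.get(key, default)

    def append(self, key, value):
        with self._sem:              # makes concurrent appends atomic
            self._data.setdefault(key, []).append(value)
```

Since only one thread can hold the permit at a time, a collector key appended to from many worker threads can't lose writes.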
Also curious: what kind of agents are you building with LangGraph? Would be more than happy to help you onboard to Flow!
Feels like you'd be a fan of - https://github.com/PrefectHQ/prefect - https://github.com/PrefectHQ/ControlFlow
Motivated by more or less the same frustration you've laid out here.
thanks for sharing, looks very interesting! Although it seems to be very high-level and hides all the details behind a prompt string. I tend to dislike that kind of thing. I designed Flow to be as bare-bones as possible and give all control to the user.
Congrats on launching! From your explanation comparing to traditional workflow engines it would be a step up from something like Airflow, but I was wondering about a comparison to Temporal which does let you create dynamic DAGs as well as provides some additional guarantees around reliability of workers in case of crashes.
Thanks! Yep, retry and durability guarantees are on the roadmap for sure. I also want to start working on the serverless offering. What kind of workflows are you building with Temporal?
I'm working on building out an AI agent right now - looking through autogen, langgraph and other current frameworks along with just building my own logic on top of a workflow orchestrator.
The ReAct paradigm for agents makes me think of standard workflows because they're not that different conceptually: read the state of the world, plan / "reason" based on user intent, perform actions that mutate it, then go back and read the state of the world if the problem hasn't been fully solved (i.e., start another workflow). Similar in concept to reading from DBs, computing logic, mutating / writing to a new table, and triggering follow-on jobs in data engineering.
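That loop can be sketched as ordinary code, with a toy rule-based planner standing in for the LLM (every name and structure here is made up for illustration):

```python
def plan(goal, observations):
    # toy "reasoning": keep looking things up until we have a population fact
    if "population" not in {o["kind"] for o in observations}:
        return {"done": False, "tool": "lookup", "args": {"kind": "population"}}
    return {"done": True, "answer": "answered: " + goal}

def lookup(kind):
    # stand-in for a tool that reads the state of the world
    return {"kind": kind, "value": 42}

TOOLS = {"lookup": lookup}

def react_loop(goal, max_steps=10):
    observations = []                           # state of the world so far
    for _ in range(max_steps):
        decision = plan(goal, observations)     # plan / "reason" over intent
        if decision["done"]:
            return decision["answer"]           # problem fully solved
        result = TOOLS[decision["tool"]](**decision["args"])  # act / mutate
        observations.append(result)             # re-read the world and loop
    raise RuntimeError("step budget exhausted")
```

Swap the rule-based plan() for a model call and the structural similarity to a read-compute-write data pipeline is hard to miss.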
Totally agree, I encourage you to give Flow a try! Would be more than happy to help you onboard. Also would love to hear if anything is missing.
Hey, I’m building agents on top of temporal as well. One of the main limitations is child workflows can not spawn other child workflows. Are you doing an activity for every prompt execution and passing those through other activities? Or something more framework-y?
Right now I'm doing prompt execution in activities, passing results to other activities.
Workflows currently only started after human events, but going to move towards additional workflows started through an activity soon - I'm keeping the state of the world independent of the execution engine so each workflow will read the new state.
Have you hit any non-determinism errors keeping workflow state outside temporal?
Prefect is also a non-DAG based workflow engine that has very similar functionality to what is outlined here; they have built a few tools in the AI space (marvin and controlflow iirc)
I did see the part about "graph is wrong," but the examples in the readme seem like you may find this interesting: https://news.ycombinator.com/item?id=42274399 (Nodezator is a generalist Python node editor; 29 Nov 2024; 78 comments)
True, graph is not wrong per se, but in my experience I saw a lot of problems the moment things start to get a little more dynamic. The main problem is predefined edges. That's where you bump into things like conditional edges, cycle edges, and so on. That's also my problem with frameworks like LangGraph: they just get in your way by pushing things that looked like a good idea in the beginning.
Thanks for sharing. Can you add a simple example that is more concrete and less abstract to the docs? It would be easier for people to understand if you did.
yep, here's one https://github.com/lmnr-ai/flow?tab=readme-ov-file#llm-agent...
Thanks!
Thank you for building this! It looks excellent and geared at exactly the same problems I've been facing. In fact, I've been working on a very similar package and this may have just saved me a ton of time. Excited to give it a try!
Thank you for the kind words! Feel free to join our discord https://discord.gg/nNFUUDAKub and discuss stuff there. Also ping me there any time, always happy to help you onboard!
you have some hypothetical examples but a more real world setup would be cool
here's a good one https://github.com/lmnr-ai/flow?tab=readme-ov-file#llm-agent...