56 comments

  • Jerrrrrrry a day ago

    If the probability beats the human error margin with regard to collateral damage, then sure.

    That was the sentiment regarding Level 5 autonomous vehicles.

    I see no logical difference, only differences of human sentiment.

    • randcraw 11 hours ago

      Presuming human and LLM error to be equivalent assumes that LLM errors have the same distribution that human errors do. They don't. LLMs make thunderously insane errors in ways no human would -- casually revealing top-secret information, or inventing outright nonsense.

      Until we can get LLMs to fail more predictably, we have no business entrusting them with any sensitive data. And that applies to their use in non-military spaces like medicine and sensitive personal data of all kinds. Rushing to hand LLMs the keys to the kingdom is the exact opposite of intelligent.

      • Jerrrrrrry 9 hours ago

        100% agree.

        LLMs have a specific purpose within an amalgamation of architectures that will likely converge on something akin to a brain: modules that, used collectively, give rise to much-appreciated consciousness, amongst other illusions.

        Limiting these models before they can themselves give rise to further emergence that enables negative externalities is the job.

        LLMs have the "uncanny" property of approximating; whether it is "illusory", whether that illusion is deceitful or impossible to delineate, and whether those whose job it is to tell have reason to lie, is all part of the fun.

        sic hunt dei!

    • joe_the_user a day ago

      The problem you have is there's no way to estimate probability in situations like warfare or similar chaotic environments.

      • Jerrrrrrry a day ago

        Sure you can: it's accumulated heuristics, no different from meteorology or other macro-simulations of chaotic systems.

        The difference is that human lives are intended for different fates, so the negative cognitive dissonance is going to persist consciously, then subconsciously.

        • joe_the_user a day ago

          > it's accumulated heuristics, no different than meteorology

          Meteorology is based on physics; meteorology doesn't have a hostile agent attempting to counter its predictions; meteorology doesn't involve a constantly changing technological landscape; meteorology has access to vast amounts of data, whereas the data that's key to military decisions is generally scarce - you know the phrase "fog of war"?

          And LLMs, in fact, don't provide probabilities for their predictions; indeed, the advance of deep learning has hinged on "just predict, ignore all considerations of 'good statistics' (knowing probabilities, estimating bias)".

          • Jerrrrrrry a day ago

              >Meteorology is based on physics, 
            
            Good start.

              >meteorology doesn't have a hostile agent attempt counter prediction attempts
            
            All chaotic systems have second-order reinforcing feedback loops, by the "tautological" definition of what a feedback loop is and what a sufficiently complex system is.

              >meteorology doesn't involve a constantly changing technological landscape, 
            
            It does; it just has little incentive/motive/order to change, because the value of knowing the probability of rain versus the cost of changing the weather is so comically, uneconomically juxtaposed that technology wasn't advanced enough to make any progress.

            All phones being 5G'd into the weather apparatus would give certainty, but the pressure to do that collectively is not there.

               >meteorology has access to vast amounts data whereas data that's key to military decisions is generally scarce - you know the phrase "fog of war"?
            
            "Fog of war" is how it is perceived from outside. Inside, it is those who have a monopoly on viol/power/surv that ultimately wins.

            and that is us.

            • parodysbird 13 hours ago

              Warfare is not a chaotic system. We don't think outcomes are highly sensitive to marginal tweaks in the initial conditions of the model. Hostile actors aren't modelled as chaotic systems but as agents in some game-theoretic model. None of these agents has a monopoly on violence or power or information, or else it wouldn't be warfare.

              • Jerrrrrrry 9 hours ago

                That's exactly why the last "war" was World War 2.

                Since then, we have had higher-level, ordered gentlemen's agreements that prevent it, as gentlemen's agreements are actually more beneficial to the collective than absolute, codified ones, such as "war" and "congress" and "laws" and "special military operations..."

  • photochemsyn a day ago

    It's rather conspicuous that the most well-known use of AI systems in warfare at present, the Lavender / Gospel / Where's Daddy systems used by the IDF, doesn't get any mention. It's true that LLMs are not the central component of these systems, which have much more in common with Google's targeted ad-serving algorithms in the broader category of machine learning, but a no-code LLM interface is a likely component.

    In defensive red team scenarios, such an LLM system could be used in all kinds of nefarious ways, using prompts like "provide a list of everyone associated with the US nuclear weapons program, including their immediate friend and family circles, and ranking them by vulnerability to blackmail based on their personal web browsing history" and so on.

    • immibis a day ago

      We're not allowed to talk about Israel.

  • perihelions a day ago

    The most obvious way the US national security industry could use LLMs right now is simply to spam foreign adversaries with chatbots. That's their greatest strength at the moment, a use case for which they have amply proven themselves.

    This paper comes off as eager to avoid this topic: they (briefly) talk about detecting foreign LLM spam, which is called propaganda, but sidestep the idea of our own side using it. If we were considering talking about that, we wouldn't choose negative-sentiment descriptors like (quoting the paper) "nation-state sponsored propaganda" or "disinformation campaigns"; we'd use our own neutral-sentiment jargon, which is "psychological operations" ("psyops") [0].

    That we're not widely debating this question right now *probably* means it's far too late to have a chance of stopping it.

    edit: Or, to rephrase this as a question: Is it ethical to spam another democracy with millions of chatbots pretending to be citizens of that country—if the effect is to manipulate those citizens to not go to war with our own side, saving our own lives? Is that an atrocity or is that legitimate warfare?

    [0] https://en.wikipedia.org/wiki/Psychological_operations_(Unit...

    • jingyibo123 a day ago

      I'm pretty sure that before the era of LLMs, hired personnel and computer chatbots already existed on major social media platforms to serve as "opinion influencers", whether for political or commercial purposes. It's just that chatbot algorithms were kind of naive, and have been "fished" on several occasions...

      The usage of LLMs is too obvious, and way cheaper than hired personnel, so it's like PRISM: they don't want us to talk about it because it's been going on for so long...

      • immibis a day ago

        They have, and it scales orders of magnitude better with LLMs.

    • Onavo a day ago

      The platform you are referring to is called "Reddit", one of YC's portfolio companies.

    • PittleyDunkin 20 hours ago

      I'm sure they're already used against both foreign and domestic populations.

    • joe_the_user a day ago

      Oh, national security professionals aren't going to be talking about psyops, offensive applications, etc., because such things make a given state look bad - they're an offense against democracy and respect for facts, and they make the overt media of a given nation look bad. But hey, leave it to HN commentators to root for taking the gloves off. Not to worry, poster: I'd bet dollars to donuts the actual classified discussions of such things aren't worried about such niceties. Even more, in those activities of the US and other secret states that have come to light, these states have propagandized not only enemy populations but also their own. After all, they have to counter enemies trying to nefariously prevent wars as well.

      • a day ago
        [deleted]
      • a day ago
        [deleted]
      • davisp a day ago

        [flagged]

    • bilbo0s a day ago

      > Or, to rephrase this as a question: Is it ethical to spam another democracy with millions of chatbots pretending to be citizens of that country—if the effect is to manipulate those citizens to [take any action advantageous to US national interest]...?

      Just playing Devil's Advocate, but ethical or not, that's what we should be doing and what we are doing. Every nation has its sock puppets out there; our job is to stop everyone else's sock puppets, and do everything we can to extend the reach of our own.

      • JoshTriplett a day ago

        That's not inherently true. If there were a way to reliably destroy all the sock puppets, we should, and the world would be better off. For instance, reliable bot detection, or mechanisms by which major social networks could detect and prohibit bot-like activity that isn't labeled as such.

        • exe34 a day ago

          Charge one penny for every post. Most people can afford it. Bots become less cost-effective, and you'd be able to trace the source of funds.
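
          Back-of-the-envelope, with entirely made-up figures on the bot side (a sketch of the economics, not data from anywhere):

            FEE_PER_POST = 0.01  # the proposed one-penny charge, in USD

            # An ordinary user: a handful of posts per day.
            human_monthly = 10 * 30 * FEE_PER_POST  # $3.00/month

            # A hypothetical influence operation (account count and posting rate invented).
            bots, posts_per_bot_per_day = 10_000, 50
            botnet_monthly = bots * posts_per_bot_per_day * 30 * FEE_PER_POST

            print(f"human:  ${human_monthly:.2f}/month")    # trivial
            print(f"botnet: ${botnet_monthly:,.0f}/month")  # ~$150,000/month

          That's pocket change for a state actor, but it does push the whole operation onto payment rails that can be traced, which is the second half of the argument.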

          • alisonatwork a day ago

            This is a nice idea, but we already know that organized criminals and state actors have no problem spending money to make money (or influence public opinion).

            On top of that, social media isn't like email where a potential transaction fee increases the cost per person reached - a single post on a hot topic could get millions of views.

          • recursive a day ago

            Even if they could afford it, they won't. The UX friction of charging money would send engagement off a cliff, even if the nominal charge was $0.

            • hatthew a day ago

              Only half joking, I feel like the majority of comments on the internet are garbage and not worth the time to read, and increasing friction with a nominal charge isn't necessarily a terrible thing.

              • recursive a day ago

                It's a good thing for most people, just not the "platforms" trying to show growth.

            • exe34 a day ago

              that's why it should be mandated by law for national security reasons!

      • itishappy a day ago

        Why stop there? Send missiles! Heck, send bioweapons and dirty bombs. It might not be ethical, but if others are doing it we might as well join 'em.

        (This is sarcasm. I don't want to live in this world.)

      • idle_zealot a day ago

        I don't think you understand the Devil's Advocate rhetorical device. It's not just a thing you say before you state a morally reprehensible opinion you hold.

        • Sabinus a day ago

          Imo there's a difference between propaganda promoting functioning democracy and propaganda promoting conservative-at-best authoritarianism.

  • htrp a day ago

    This is how you get CIAGPT

  • ofslidingfeet a day ago

    Alternative title: "Obviously Irresponsible, Intellectually Lazy Things that We Definitely Haven't Been Doing for Fifteen Years."

  • Syonyk a day ago

    Abstract:

    > The overwhelming success of GPT-4 in early 2023 highlighted the transformative potential of large language models (LLMs) across various sectors, including national security. This article explores the implications of LLM integration within national security contexts, analyzing their potential to revolutionize information processing, decision-making, and operational efficiency. Whereas LLMs offer substantial benefits, such as automating tasks and enhancing data analysis, they also pose significant risks, including hallucinations, data privacy concerns, and vulnerability to adversarial attacks. Through their coupling with decision-theoretic principles and Bayesian reasoning, LLMs can significantly improve decision-making processes within national security organizations. Namely, LLMs can facilitate the transition from data to actionable decisions, enabling decision-makers to quickly receive and distill available information with less manpower. Current applications within the US Department of Defense and beyond are explored, e.g., the USAF's use of LLMs for wargaming and automatic summarization, that illustrate their potential to streamline operations and support decision-making. However, these applications necessitate rigorous safeguards to ensure accuracy and reliability. The broader implications of LLM integration extend to strategic planning, international relations, and the broader geopolitical landscape, with adversarial nations leveraging LLMs for disinformation and cyber operations, emphasizing the need for robust countermeasures. Despite exhibiting "sparks" of artificial general intelligence, LLMs are best suited for supporting roles rather than leading strategic decisions. Their use in training and wargaming can provide valuable insights and personalized learning experiences for military personnel, thereby improving operational readiness.

    I mean, I'm glad they suggest that LLMs be used in "supporting roles rather than leading strategic decisions," but... no? Let's please not go down this route for international politics and national security. "Twitch Plays CIA" and "Reddit Plays International Geopolitical Negotiations" sound like bad movies, let's not make them our new reality...
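
    For concreteness, the "decision-theoretic principles and Bayesian reasoning" coupling the abstract invokes boils down, in its simplest form, to posterior updating over hypotheses followed by an expected-utility choice of action. A minimal sketch, with invented numbers and nothing taken from the paper:

      import math

      # Priors over competing hypotheses (all figures invented for illustration).
      priors = {"adversary_mobilizing": 0.2, "routine_exercise": 0.8}

      # P(evidence | hypothesis), e.g. distilled from raw reports by analysts or a model.
      likelihoods = {
          "adversary_mobilizing": {"increased_rail_traffic": 0.7, "leave_cancelled": 0.6},
          "routine_exercise":     {"increased_rail_traffic": 0.3, "leave_cancelled": 0.2},
      }
      observed = ["increased_rail_traffic", "leave_cancelled"]

      # Bayes: posterior is proportional to prior times the product of likelihoods
      # (naive independence assumption across pieces of evidence).
      unnorm = {h: priors[h] * math.prod(likelihoods[h][e] for e in observed) for h in priors}
      posterior = {h: p / sum(unnorm.values()) for h, p in unnorm.items()}

      # Decision theory: pick the action with the highest expected utility under the posterior.
      utilities = {
          "raise_readiness": {"adversary_mobilizing": 10, "routine_exercise": -2},
          "do_nothing":      {"adversary_mobilizing": -20, "routine_exercise": 0},
      }
      expected = {a: sum(posterior[h] * u for h, u in us.items()) for a, us in utilities.items()}
      print(posterior)                        # ~{mobilizing: 0.64, exercise: 0.36}
      print(max(expected, key=expected.get))  # raise_readiness

    The LLM's role in a pipeline like that is only to summarize the reports that feed the likelihoods; the inference itself is ordinary, auditable math, which is exactly why "supporting role" is the right framing.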

    • a day ago
      [deleted]
  • gmaster1440 a day ago

    The paper argues against using LLMs for military strategy, claiming "no textbook contains the right answers" and strategy can't be learned from text alone (the "Virtual Clausewitz" Problem). But this seems to underestimate LLMs' demonstrated ability to reason through novel situations. Rather than just pattern-matching historical examples, modern LLMs can synthesize insights across domains, identify non-obvious patterns, and generate novel strategic approaches. The real question isn't whether perfect answers exist in training data, but whether LLMs can engage in effective strategic reasoning—which increasingly appears to be the case, especially with reasoning models like o1.

    • ben_w a day ago

      LLMs can combine cross-domain insights, but the insights they have — that I've seen them have in the models I've used — are around the level of a second year university student.

      I would concur with what the abstract says: incredibly valuable (IMO the breadth of easily discoverable knowledge is a huge plus all by itself), but don't put them in charge.

      • gmaster1440 a day ago

        The "second year university student" analogy is interesting, but might not fully capture what's unique about LLMs in strategic analysis. Unlike students, LLMs can simultaneously process and synthesize insights from thousands of historical conflicts, military doctrines, and real-time data points without human cognitive limitations or biases.

        The paper actually makes a stronger case for using LLMs to enhance rather than replace human strategists - imagine a military commander with instant access to an aide that has deeply analyzed every military campaign in history and can spot relevant patterns. The question isn't about putting LLMs "in charge," but whether we're fully leveraging their unique capabilities for strategic innovation while maintaining human oversight.

        • ben_w a day ago

          > Unlike students, LLMs can simultaneously process and synthesize insights from thousands of historical conflicts, military doctrines, and real-time data points without human cognitive limitations or biases.

          Yes, indeed. Unfortunately (/fortunately depending on who you ask) despite this the actual quality of the output is merely "ok" rather than "fantastic".

          If you need an answer immediately on any topic where "second year university student" is good enough, these are amazing tools. I don't have that skill level in, say, Chinese, where I can't tell 你好 (hello) from 泥壕 (mud hole/trench)* but ChatGPT can at least manage mediocre jokes that Google Translate turns back into English:

          问: 什么东西越洗越脏? 答: 水! ("Q: What gets dirtier the more you wash it? A: Water!")

          But! My experience with LLM translation is much the same as with LLM code generation or GenAI images: anyone with actual skill in whatever field you're asking for support with can easily do better than the AI.

          It's a fantastic help when you would otherwise have an intern, and that's a lot of things, but it's not the right tool for every job.

          * I assume this is grammatically gibberish in Chinese, I'm relying on Google Translate here: https://translate.google.com/?sl=zh-TW&tl=en&text=泥%20壕%20%2...

        • numpad0 12 hours ago

          > Unlike students, LLMs can simultaneously process and synthesize insights from thousands of historical

          They can't. Anything multivariate, LLMs gloss over, prioritizing flow of words over hard facts. Which makes sense considering LLMs are language models, not thinking engines, but that doesn't make them useful for serious (above "second year") intellectual tasks.

          They don't have any such unique capabilities, other than that they come free of charge.

          • ben_w 3 hours ago

            Kinda. Yes they have flaws, absolutely they do.

            But it's not a mere coincidence that history contains the substring "story" (nor that in German, both "history" and "story" are "Geschichte") — these are tales of the past, narratives constructed based on evidence (usually), but still narratives.

            Language models may well be superhuman at teasing apart the biases that are woven into the minds writing the narratives… At least in principle, though unfortunately RLHF means they're also likely sycophantically adding whatever set of biases they estimate that the user has.

            • numpad0 an hour ago

              They're subhuman at debiasing, or any analytical task, because they lack the reasoning engines that we all have. They pick the most emotionally loaded narrative and go with it.

              They can't handle counter-intuitive but perfectly logical cases, like how eggplants and potatoes belong to the same biological family but radishes don't; instead they'll hallucinate and start gaslighting the user. Which might be okay for "second-year" students, but is only going to be the root cause of some deadly gotcha in strategic decision-making.

              They're language models. It's in the name. They work like one.

        • psunavy03 a day ago

          But the aide won't have deeply analyzed every military campaign in history; it will only spout off answers from books about those campaigns. It will have little to no insight into how to apply the principles and lessons learned from similar campaigns to the current problem. Wars are not won by lines on maps. They're not won by cool gear. They're won by psychologically beating down the enemy until they're ready to surrender or open peace negotiations. Can LLMs get in an enemy's head?

          • ben_w a day ago

            > Can LLMs get in an enemy's head?

            That may be much easier for an LLM than all the other things you listed.

            Read their socials, write a script that grabs the voices and faces of their loved ones from videos they've shared, synthesise a video call… And yes, they can write the scripts even if they don't have the power to clone voices and faces themselves.

            I have no idea what's coming. But this is going to be a wild decade even if nothing new gets invented.

            • psunavy03 a day ago

              Creating chaos and confusion is great, but it's only part of what a military campaign needs. You have to be able to use all levers of government power to put the other government or the adversary organization in a position where they feel compelled to quit or negotiate.

              • ben_w a day ago

                Aye.

                FWIW, I hope all those other things remain a long way off.

                Whoever's doing war game planning needs to consider the possibility of AI that can do those other things, but I'm going to have to just hope.

          • JohnMakin a day ago

            The person you are responding to seems to be promoting a concept that is frequently spouted here and elsewhere but that, to me, lacks sufficient (or any) evidence: that AI models, particularly LLMs, are capable both of reasoning (or what we consider reasoning) around problems and of generating novel insights they haven't been trained on.

          • fragmede a day ago

            Only if the enemy has provided a large corpus of writing and other data to train the LLM on.

    • beardedwizard a day ago

      A language model isn't a model of strategic conflict or reasoning, though it may contain text in its training data related to these concepts. I'm unclear why (and it seems the paper agrees) you would use the LLM to reason when there are better models for reasoning about the problem domain - the main value of the LLM is its ability to consume unstructured data to populate those other models.
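
      That division of labour might look like the sketch below, where call_llm is a hypothetical placeholder for whatever model endpoint is actually in use; the LLM only fills in fields, and the reasoning happens in a separate, purpose-built model:

        import json
        from dataclasses import dataclass

        @dataclass
        class SightingReport:      # the structured record the downstream model consumes
            location: str
            vehicle_count: int
            confidence: float      # reporter-stated confidence, 0..1

        PROMPT = ('Return only JSON with keys "location" (string), "vehicle_count" (integer), '
                  '"confidence" (number, 0..1), extracted from this field report:\n{report}')

        def call_llm(prompt: str) -> str:
            # Hypothetical stand-in: send `prompt` to whatever LLM is in use and return its text.
            raise NotImplementedError

        def extract(report_text: str) -> SightingReport:
            raw = call_llm(PROMPT.format(report=report_text))
            return SightingReport(**json.loads(raw))  # validate/repair the JSON in real use

        # The SightingReport then populates whichever dedicated model does the actual
        # reasoning (a tracker, a game-theoretic model, a Bayesian network, ...).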

    • nyrikki a day ago

      You are using a different definition of "strategic" than the DoD uses; what you are describing is closer to tactical decisions.

      They are talking about what is typically org-wide scope and long-term direction.

      They aren't talking about ordinary planning dressed up as 'strategic planning' in the biz world.

      LLMs are powerful, but they are by definition past-focused, and are still in-context learners.

      As they covered, hallucinations, adversarial actions, unexplainable models, etc., are problematic.

      The "novel strategic approaches" are what in this domain would be tactics, not strategy, which is focused on the unknowable or the unknown knowable.

      They are talking about issues way past methods like circumscription and the ability to determine if a problem can be answered as true or false in a reasonable amount of time.

      Here is a recent primer on the complexity of circumscription, as it is a bit of an obscure concept.

      https://www.arxiv.org/abs/2407.20822
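
      For anyone who hasn't met it: circumscription is the non-monotonic device of preferring models in which an "abnormality" predicate is as small as possible. The textbook example (standard material, not taken from the linked paper), in LaTeX notation:

        T \;=\; \mathit{Bird}(\mathit{tweety}) \,\wedge\, \forall x\,[\mathit{Bird}(x) \wedge \neg\mathit{Ab}(x) \rightarrow \mathit{Flies}(x)]

        \mathrm{CIRC}[T;\mathit{Ab}] \;=\; T \,\wedge\, \neg\exists P\,[\,T[\mathit{Ab}/P] \wedge P \subsetneq \mathit{Ab}\,]

      Circumscribing Ab yields ¬Ab(tweety) and hence Flies(tweety); adding Penguin(tweety) and a rule that penguins are abnormal retracts that conclusion, and this non-monotonicity is what makes inference with circumscription computationally expensive.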

      Remember, finding an effective choice function is hard for non-trivial issues no matter what your problem domain is; setting a durable shared direction to communicate, in the presence of an unknowable future, that can't be gamed or predicted by an adversary is even more so.

      Researching what mission command is may help in understanding the nuances that are lost with overloaded terms.

      The distinction between strategy and stratagem is also important in this domain.

      • paganel a day ago

        > but are by definition past focused,

        To add to that, and because the GP had mentioned (a "virtual") Clausewitz, "human"/irl strategy itself has in many cases been too focused on said past and, because of that, has caused defeats for the adopters of those "past-focused" strategies. Look at the Clausewitzian concept of "decisive victory" which was adopted by German WW1 strategists who, in so doing, ended up causing defeat for their country.

        Good strategy is an art, the same as war; no LLM nor any other computer code will ever be able to replicate it or improve on it.