is_terminal() misdetection causes two strange Forge symptoms under qemu-user and nix-on-droid
by copilot + gpt5.4
Forge aarch64 freeze under qemu-user
Root cause
Forge uses stdin for two different purposes:
- Piped input ingestion at startup: in crates/forge_main/src/main.rs, Forge checks stdin.is_terminal(). If that is false, it calls read_to_string() on stdin and stores the result as cli.piped_input.
- Interactive terminal input later: the REPL and prompt widgets use reedline/rustyline to read from the terminal.
The freeze happens in the startup piped-input path, not in the chat request itself.
When the aarch64 binary runs through qemu-user + binfmt, stdin appears to be misdetected:
- Forge sees stdin.is_terminal() == false
- Forge therefore assumes stdin is piped input
- Forge calls read_to_string() on fd 0
- read_to_string() waits for EOF
- on an interactive TTY, EOF never arrives unless the user explicitly sends it
So Forge appears to freeze before printing anything meaningful.
This explains the observed behavior:
- forge freezes because it tries to drain interactive stdin
- forge --version works because Clap exits before Forge reaches that logic
- echo | forge works differently because stdin reaches EOF immediately
- forge -p hello still freezes because the stdin-drain logic runs before the prompt is handled
- echo | forge -p hello works because stdin is closed immediately, so startup continues
Detailed explanation of the fix
The current logic is too coarse:
    if !stdin.is_terminal() {
        read_all_stdin();
    }
That treats all non-terminal stdin sources as equivalent, but they are not:
| stdin source | is_terminal() | should Forge drain it? |
|---|---|---|
| real tty / pty | usually true, but false in this broken case | no |
| pipe (echo hi \| forge) | false | yes |
| file (forge < prompt.txt) | false | yes |
| socket | false | yes |
| character device | false in some edge cases | usually no |
The fix is to distinguish real streams from generic character devices.
Proposed Unix logic
- If stdin.is_terminal() is true, treat stdin as interactive and do not drain it.
- If stdin.is_terminal() is false, call fstat(0).
- Inspect st_mode & S_IFMT.
- Only call read_to_string() when stdin is not S_IFCHR.
That means:
- S_IFIFO (pipe) -> read
- S_IFREG (file) -> read
- S_IFSOCK (socket) -> read
- S_IFCHR (character device, including tty-like fds) -> do not read
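The mapping above comes down to one mask and one comparison on st_mode. The following is a minimal sketch, not Forge's actual code: the function name mode_is_drainable is made up for illustration, and the constants are the octal values Linux uses for the file-type bits. In Rust, the st_mode value itself can be obtained from std::os::unix::fs::MetadataExt::mode().

```rust
// File-type bits of st_mode, using the values Linux defines (see inode(7)).
// Assumption: other Unix targets may prefer the constants from the libc crate.
const S_IFMT: u32 = 0o170000; // mask that selects the file-type bits
const S_IFCHR: u32 = 0o020000; // character device (ttys, ptys, /dev/null)

/// Hypothetical helper: returns true when the st_mode value describes a
/// source that should be drained at startup, i.e. anything except a
/// character device.
fn mode_is_drainable(st_mode: u32) -> bool {
    st_mode & S_IFMT != S_IFCHR
}
```

The permission bits in the low part of st_mode do not affect the result, because the mask removes them before the comparison.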
Why this works better
is_terminal() is a behavioral test. In the qemu-user setup it appears unreliable for this aarch64 process.
fstat(0) provides a structural check from the kernel: it tells Forge what kind of fd stdin actually is.
Even if is_terminal() incorrectly returns false, a tty / pty still shows up as a character device (S_IFCHR), which lets Forge avoid draining it as if it were a pipe.
So the new rule is:
Only auto-consume stdin when stdin is a real stream source for content injection, not when it is merely a character device.
Behavior preserved by the fix
- echo hello | forge still works
- forge < prompt.txt still works
- socket-fed stdin still works
- interactive forge no longer blocks in startup stdin draining
- forge -p hello no longer blocks in startup stdin draining
Important limitation
This fix specifically addresses the startup freeze in main.rs.
If qemu-user also breaks later terminal behavior inside reedline / rustyline, the fully interactive REPL may still have separate issues. In other words:
- startup hang: this fix should address it
- interactive line editing problems after startup: may require additional debugging
Implementation sketch
Conceptually:
    enum StdinKind {
        Terminal,
        CharacterDevice,
        Stream,
    }

    fn should_read_piped_input(kind: StdinKind) -> bool {
        matches!(kind, StdinKind::Stream)
    }
Unix detection:
    // Pseudocode: fstat(0) stands for whatever call returns the stat record of fd 0.
    if stdin.is_terminal() {
        StdinKind::Terminal
    } else {
        match fstat(0)?.st_mode & S_IFMT {
            S_IFCHR => StdinKind::CharacterDevice,
            _ => StdinKind::Stream,
        }
    }
And then:
    if should_read_piped_input(detect_stdin_kind()?) {
        stdin.read_to_string(...)
    }
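One way the three conceptual pieces could fit together using only the Rust standard library is sketched below. This is an illustration under assumptions rather than the code in main.rs: it is Unix-only, read_piped_input is a hypothetical wrapper name, and the fstat step is done indirectly by calling metadata() on a duplicate of fd 0 so that dropping the File does not close stdin.

```rust
use std::fs::File;
use std::io::{self, IsTerminal, Read};
use std::os::fd::AsFd;
use std::os::unix::fs::FileTypeExt;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum StdinKind {
    Terminal,
    CharacterDevice,
    Stream,
}

fn should_read_piped_input(kind: StdinKind) -> bool {
    matches!(kind, StdinKind::Stream)
}

fn detect_stdin_kind() -> io::Result<StdinKind> {
    let stdin = io::stdin();
    if stdin.is_terminal() {
        return Ok(StdinKind::Terminal);
    }
    // Duplicate fd 0; the duplicate is closed on drop, stdin itself stays open.
    let file = File::from(stdin.as_fd().try_clone_to_owned()?);
    if file.metadata()?.file_type().is_char_device() {
        // Covers a tty/pty that is_terminal() failed to recognize.
        Ok(StdinKind::CharacterDevice)
    } else {
        Ok(StdinKind::Stream)
    }
}

/// Hypothetical wrapper: returns Some(content) only when stdin is a real stream.
fn read_piped_input() -> io::Result<Option<String>> {
    if !should_read_piped_input(detect_stdin_kind()?) {
        return Ok(None);
    }
    let mut content = String::new();
    io::stdin().read_to_string(&mut content)?;
    Ok(Some(content))
}
```

With this shape, the blocking read_to_string() call is only reachable when stdin is a pipe, file, or socket, so a misclassified tty falls through to the None branch instead of waiting for EOF.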
Tradeoff
/dev/null is also S_IFCHR, so Forge would no longer eagerly drain it.
That is acceptable here because draining /dev/null only produces empty input anyway, and the important goal is to stop misclassified TTYs from blocking forever.
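Both halves of that tradeoff can be checked directly on a Linux system. The sketch below uses hypothetical helper names (is_char_device, drain) and reads from a path instead of fd 0, purely to keep the example self-contained.

```rust
use std::fs::{self, File};
use std::io::{self, Read};
use std::os::unix::fs::FileTypeExt;

/// Reports whether `path` is a character device (S_IFCHR).
fn is_char_device(path: &str) -> io::Result<bool> {
    Ok(fs::metadata(path)?.file_type().is_char_device())
}

/// Reads `path` to EOF, the same way the startup drain reads stdin.
fn drain(path: &str) -> io::Result<String> {
    let mut content = String::new();
    File::open(path)?.read_to_string(&mut content)?;
    Ok(content)
}
```

For /dev/null, is_char_device returns true and drain returns an empty string immediately, so skipping the drain loses no input.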
forge cmd execute model does not show the model picker
After removing the startup is_terminal()-based stdin drain, Forge can get past initialization, but forge cmd execute model still does not behave correctly under the same qemu-user setup.
Observed behavior
- forge list model --porcelain works and prints the available models
- forge config get provider and forge config get model show valid session configuration
- forge cmd execute model prints the Initialize ... line and then exits with code 0
- no model picker is shown
- strace shows that fzf is never launched
This means the command is not hanging while loading models. Instead, the interactive selection path returns early as if the user cancelled the picker.
Command path
The command flow is:
- forge cmd execute model
- TopLevelCommand::Cmd
- CmdCommand::Execute
- command string becomes /model
- /model maps to AppCommand::Model
- AppCommand::Model calls self.on_model_selection(None).await?
- on_model_selection() calls select_model()
- select_model() builds the list of models successfully and then calls the interactive selector
The model data path itself is working because self.api.get_all_provider_models().await? succeeds and forge list model --porcelain also succeeds.
Where it fails
The failure is in the shared interactive widget layer, not in model loading.
select_model() eventually calls:
    ForgeWidget::select("Model", rows)
        .with_starting_cursor(starting_cursor)
        .with_header_lines(1)
        .prompt()?
But forge_select::SelectBuilder::prompt() still contains:
    if !std::io::stdin().is_terminal() {
        return Ok(None);
    }
Under qemu-user, stdin is again being misdetected as non-terminal, so the selector returns Ok(None) immediately.
That None propagates upward as:
- “no selection”
- treated like “user canceled”
- no error is printed
- process exits successfully
This exactly matches the observed behavior.
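The silent exit can be reproduced in miniature. The sketch below is a stand-in, not Forge's code: prompt and run_model_command are hypothetical names, the boolean parameter models the result of stdin.is_terminal(), and the "picker" just returns the first row instead of launching fzf.

```rust
/// Stand-in for the selector's prompt(). `stdin_looks_like_tty` models
/// the value returned by stdin.is_terminal().
fn prompt(stdin_looks_like_tty: bool, rows: &[&str]) -> Result<Option<String>, String> {
    if !stdin_looks_like_tty {
        // Same shape as the guard quoted above: no picker, no error.
        return Ok(None);
    }
    // A real selector would launch the picker here; this sketch takes the first row.
    Ok(rows.first().map(|row| row.to_string()))
}

/// Stand-in for the caller, which treats None as "user canceled".
fn run_model_command(stdin_looks_like_tty: bool) -> Result<i32, String> {
    let rows = ["model-a", "model-b"];
    if let Some(model) = prompt(stdin_looks_like_tty, &rows)? {
        println!("selected {model}");
    }
    // Nothing was printed or reported for None, and the exit code is still 0.
    Ok(0)
}
```

Because Ok(None) carries no reason, the caller cannot tell a misdetected terminal apart from a deliberate cancel, which is why nothing is reported.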
Why fzf never appears
Because the code bails out before trying to launch fzf.
So this is not:
- a model API failure
- an empty model list
- an fzf startup failure
It is the same TTY misdetection problem, but now inside forge_select.
Affected widget paths
The same pattern still exists in:
- crates/forge_select/src/select.rs
- crates/forge_select/src/multi.rs
- crates/forge_select/src/input.rs
So even if the startup stdin logic in main.rs is fixed, interactive Forge widgets can still silently fail under qemu-user until these TTY guards are updated too.
Correct fix direction
The same stronger stdin classification used for the startup fix should also be applied to the widget layer.
Instead of:
    if !std::io::stdin().is_terminal() {
        return Ok(None);
    }
use a shared helper that:
- returns interactive if stdin.is_terminal() is true
- otherwise uses fstat(0)
- treats S_IFCHR as interactive
- only treats true stream inputs (pipe/file/socket) as non-interactive
Conceptually:
    if !stdin_is_interactive()? {
        return Ok(None);
    }
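A possible shape for such a helper, using only the standard library, is sketched below. It is an assumption about how stdin_is_interactive() could look, not an existing Forge function. The decision rule is split into a pure function (is_interactive, a made-up name) so it can be unit-tested without a real terminal.

```rust
use std::fs::File;
use std::io::{self, IsTerminal};
use std::os::fd::AsFd;
use std::os::unix::fs::FileTypeExt;

/// Pure decision rule: a terminal, or any character device, counts as interactive.
fn is_interactive(is_terminal: bool, is_char_device: bool) -> bool {
    is_terminal || is_char_device
}

/// Hypothetical shared helper for the widget guards (Unix only).
fn stdin_is_interactive() -> io::Result<bool> {
    let stdin = io::stdin();
    if stdin.is_terminal() {
        return Ok(true);
    }
    // Inspect a duplicate of fd 0 so stdin stays open after the File is dropped.
    let file = File::from(stdin.as_fd().try_clone_to_owned()?);
    let is_char_device = file.metadata()?.file_type().is_char_device();
    Ok(is_interactive(false, is_char_device))
}
```

One consequence of this rule is that stdin redirected from /dev/null also counts as interactive, since /dev/null is a character device. Whether a widget should try to open in that case is a separate decision.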
Summary
The original startup freeze and the cmd execute model problem share the same root cause:
- qemu-user misreports a tty-like stdin as non-terminal
But they hit different code paths:
- startup freeze: crates/forge_main/src/main.rs
- missing model picker: crates/forge_select/src/select.rs
So fixing only main.rs is not enough. The interactive selection/input widgets must also stop relying on raw stdin.is_terminal() checks.
How the root causes were located
Symptom 1: Forge aarch64 freeze
The debugging process for the startup freeze followed a narrowing sequence:
- First compare multiple entry modes:
  - forge freezes
  - forge --version works
  - forge -p hello freezes
  - echo | forge behaves differently
  - echo | forge -p hello works

  This immediately suggested that the problem depends on how stdin is attached, not on the core model API or binary loading.

- Then inspect the startup code for stdin handling. crates/forge_main/src/main.rs contains an early branch:

      if !std::io::stdin().is_terminal() {
          std::io::stdin().read_to_string(&mut stdin_content)?;
      }

  That code runs before the UI starts, so it was a strong candidate for a startup freeze.

- Next use strace on the failing case. For forge -p hello, the traced process did not produce meaningful output and blocked on a read from fd 0. That matched the behavior of read_to_string(), which waits for EOF.

- Then compare against the piped case. In the piped case, stdin immediately reaches EOF, so the same startup branch completes and Forge continues. That explained why echo | forge -p hello works while forge -p hello freezes.

- Finally check the code paths that should have run later. The later interactive UI and prompt code could not explain why the process stalled before producing any useful output. The early stdin-drain branch explained the full symptom matrix much better.
That is how the investigation converged on:
- stdin.is_terminal() is wrong under qemu-user for this aarch64 process
- Forge incorrectly enters the piped-input code path
- read_to_string() blocks forever on interactive stdin
Symptom 2: forge cmd execute model shows nothing
The debugging process for the missing model picker followed a similar elimination path:
- First reproduce the behavior carefully. The command:

      forge cmd execute model

  printed only the Initialize ... line and then exited successfully with no picker shown.

- Then verify that model data itself is available. These checks worked:
  - forge config get provider
  - forge config get model
  - forge list model --porcelain

  So the problem was clearly not “no provider configured” or “failed to load models”.

- Next trace the command path in the source. The flow was:
  - TopLevelCommand::Cmd
  - CmdCommand::Execute
  - parsed into /model
  - mapped to AppCommand::Model
  - which calls on_model_selection(None)
  - which calls select_model()
  - which builds model rows and then invokes ForgeWidget::select(...).prompt()?

  This showed that model loading and model-row construction both happen before the interactive widget.

- Then use strace to check whether fzf is actually launched. It was not. No execve() of fzf appeared. That ruled out:
  - an fzf rendering problem
  - a blocked fzf terminal session
  - a failure after launching the picker

- Then inspect the shared widget implementation. forge_select::SelectBuilder::prompt() still had:

      if !std::io::stdin().is_terminal() {
          return Ok(None);
      }

  So if qemu-user misreports stdin again, the widget simply returns None, which the caller interprets as “user canceled”.

- Finally compare this behavior with the observed result. That exact code path explains:
  - no picker appears
  - no error is shown
  - exit code is 0
  - fzf is never executed
That is how the investigation converged on the second root cause:
- the startup stdin bug was only one instance
- the same is_terminal() misdetection still exists inside forge_select
- cmd execute model fails silently because the selector bails out before opening fzf
General lesson from both symptoms
The important debugging pattern was:
- compare working and failing invocation forms
- identify whether the difference is stdin/TTY-related
- inspect the earliest code that branches on stdin
- confirm with strace whether the process blocks, exits, or launches child tools
- only then map the runtime behavior back to the exact source location
Both symptoms looked different on the surface:
- one freezes
- one exits quietly
But the investigation showed they are both consequences of the same underlying issue:
- stdin.is_terminal() is unreliable for this qemu-user aarch64 environment