Abstract
Finite-state dimension, introduced as a finite-state analogue of Hausdorff dimension, quantifies the lower asymptotic density of information in an infinite sequence as perceived by finite-state automata. It admits several equivalent formulations; two particularly useful ones are via finite-state gambling strategies and via the optimal asymptotic compression ratio achieved by information-lossless finite-state compressors.
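For orientation, the compression characterization of a binary sequence $X$ can be written as follows (a standard formulation; the notation $X \upharpoonright n$ for the first $n$ bits of $X$ is assumed here, not fixed by this abstract):
\[
\dim_{\mathrm{FS}}(X) \;=\; \inf_{C \in \mathcal{C}} \, \liminf_{n \to \infty} \frac{|C(X \upharpoonright n)|}{n},
\]
where $\mathcal{C}$ ranges over all information-lossless finite-state compressors and $|C(X \upharpoonright n)|$ denotes the bit-length of the compressor's output.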
Normal sequences represent the highest level of algorithmic randomness visible to finite automata, and they are exactly the sequences of finite-state dimension $1$. This motivates a bounded-memory notion of randomness extraction: can a finite-state transducer, reading a single sequence in one streaming pass, extract a normal output from a single input source? More modestly, can it always transform the input into an output of strictly higher finite-state dimension?
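Concretely, a binary sequence $X$ is normal when every finite block occurs with its uniform asymptotic frequency; in the standard formulation (notation ours), for every $w \in \{0,1\}^*$,
\[
\lim_{n \to \infty} \frac{\#\{\, i \le n - |w| \;:\; X[i \,..\, i+|w|-1] = w \,\}}{n} \;=\; 2^{-|w|}.
\]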
Finite-state transducers can perform surprisingly effective one-pass transformations: even with constant memory they can implement variable-length coding schemes such as Shannon--Fano coding, remove local redundancy, and increase the apparent randomness rate of many structured or stochastic inputs, as the sketch below illustrates.
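As a toy illustration (our own sketch, not a construction from this work), the following Python fragment realizes a Shannon--Fano-style code for $2$-bit blocks as a three-state transducer, assuming the hypothetical block distribution $P(00)=\tfrac12$, $P(01)=\tfrac14$, $P(10)=P(11)=\tfrac18$:
\begin{verbatim}
def transduce(bits, transitions, start):
    # Run a finite-state transducer one symbol at a time:
    # transitions maps (state, symbol) -> (next_state, output string).
    state, out = start, []
    for b in bits:
        state, emitted = transitions[(state, b)]
        out.append(emitted)
    return "".join(out)

# Shannon--Fano-style prefix code for 2-bit blocks:
# 00 -> 0, 01 -> 10, 10 -> 110, 11 -> 111.
T = {
    ("q0", "0"):   ("saw0", ""),    # buffer the first bit in the state
    ("q0", "1"):   ("saw1", ""),
    ("saw0", "0"): ("q0", "0"),     # block 00
    ("saw0", "1"): ("q0", "10"),    # block 01
    ("saw1", "0"): ("q0", "110"),   # block 10
    ("saw1", "1"): ("q0", "111"),   # block 11
}

print(transduce("00010011", T, "q0"))  # prints "0100111"
\end{verbatim}
On inputs whose block statistics match the assumed distribution, such a recoding shortens the stream and thereby raises the information density per output bit; the results stated next show that no transducer can achieve this on the sequences we construct.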
We show that randomness extraction by finite-state transducers is impossible, in a strong and explicit form. For every rational $s \in (0,1)$, we construct a nearly linear-time computable binary sequence $X$ with $\dim_{\mathrm{FS}}(X)=s$ such that for every finite-state transducer $T$, the output satisfies $\dim_{\mathrm{FS}}(T(X)) \le s$. Thus, for these sequences, finite-state transduction cannot extract normality; indeed, it cannot even improve finite-state dimension. Our proof proceeds by a structural analysis of finite-state transducers together with a dimension-preserving diagonal construction that, for each target $s$, builds a sequence whose organization defeats every such transducer's attempt to concentrate randomness. The result is a finite-state analogue of Miller's non-extractability phenomenon for effective dimension, but its proof relies on substantially different techniques, tailored to the finite-state setting.
Furthermore, we show that the impossibility persists even with multiple independent input streams. We treat two notions of independence: (i) independence based on Kolmogorov complexity (via joint prefix complexity), and (ii) a finite-state notion of relative independence, formulated via relative finite-state dimension. In sharp contrast with the effective-dimension setting, where two independent sources suffice for a uniform effective procedure that boosts the randomness rate arbitrarily close to~$1$, we show that finite-state dimension exhibits no comparable multi-source extraction phenomenon. Specifically, for every rational $s\in(0,1)$ and every fixed $k\ge 2$, there exist $k$ independent sources $X_1,\dots,X_k$, each of finite-state dimension~$s$, such that for every $k$-input finite-state transducer $T$, the output satisfies $\dim_{\mathrm{FS}}(T(X_1,\dots,X_k))\le s$. Thus, even independent streams do not enable bounded-memory transduction to output a normal sequence, or even to increase finite-state dimension.