Paper: Theoretical Basis for Autonomous Optimization in EmoNAVI v3.6 — Improving the Regret Bound via Higher-Order Moment Approximation and Dynamic Distance Estimation

1. Introduction

In deep learning optimization, dynamic adjustment of the learning rate is the most critical factor determining convergence performance. Conventional methods such as Adam and AMSGrad utilize the first and second moments of the gradients, but their ability to directly estimate the steepness (curvature) of the local loss landscape, or the distance D to the optimal solution, is limited. This paper demonstrates that the "emotional scalar \sigma_t" and the "emoDrive" mechanism introduced in EmoNAVI v3.6 function mathematically as an approximation of higher-order moments and as an online implementation of D-adaptation (Defazio & Mishchenko, 2023) and COCOB theory (Orabona & Tommasi, 2017), achieving both extremely low hyperparameter sensitivity and robust convergence.

2. Mathematical Redefinition of the Implementation and Higher-Order Moment Approximation

2.1 Generating Proxy Indicators Using Multi-EMA

EmoNAVI maintains a three-tier exponential moving average of the loss (short, medium, long):

\[
\mathrm{EMA}_{\mathrm{short},t} = (1 - \alpha_s)\,\mathrm{EMA}_{\mathrm{short},t-1} + \alpha_s L_t,
\]

with analogous recursions for the medium and long tiers. Taking the difference \Delta\mathrm{EMA} = \mathrm{EMA}_{\mathrm{long}} - \mathrm{EMA}_{\mathrm{short}} between EMAs with different smoothing coefficients \alpha corresponds to approximating higher-order derivatives of the loss function L over time.

- Approximation of the third and fourth moments: \Delta\mathrm{EMA} captures the rate of change of the gradient (the change in curvature).
- Fifth-order moment history: the emotional scalar \sigma_t = \tanh(\Delta\mathrm{EMA}/\mathrm{scale}) is a statistic that nonlinearly compresses this higher-order information into the range [-1, 1]. By recursively incorporating it into the update rule, the "smoothness" of the long-term terrain is reflected in the parameter update. (A minimal sketch of this mechanism is given after Section 4.1.)

3. Dynamic Distance Estimation via emoDrive (D-Adaptation)

3.1 Online Approximation for D-Estimation

D-adaptation algorithms estimate the distance D from the initial point to the optimal solution and scale the learning rate proportionally to D. In EmoNAVI, emoDrive fulfills the role of this D.

- Acceleration zone (high confidence): in regions where \sigma_t is stable, the current search direction is deemed correct (lying on the straight path toward the optimal solution w^*), and the effective step size is boosted to at least 8 times its original value. This operation is equivalent to exponentially increasing the estimated distance \hat{D}.
- Suppression zone (low confidence): during abrupt changes where |\sigma_t| > 0.75, updates are suppressed on the order of O(1 - |\sigma_t|). This serves as a safety mechanism against sudden increases in the local Lipschitz constant L_t, equivalent to the "reset of the betting amount after a losing streak" in COCOB (Orabona & Tommasi, 2017). (A sketch of this two-zone rule is also given after Section 4.1.)

The higher-order moments referred to here are:

- 3rd: skewness
- 4th: kurtosis
- 5th: the "variation of variation" in the time dimension

Note: these higher-order moments are formed not by a single step but by temporal integration.

4. Convergence Proof and Regret Analysis

4.1 Assumptions and Properties

- L-smoothness: the loss function f has a local Lipschitz constant L_t, and \|\nabla f(w)\| \le G.
- Boundedness of emoDrive: 0 < B_{\mathrm{low}} \le \mathrm{emoDrive} \le B_{\mathrm{high}}.

Suppression zone, |\sigma_t| > 0.75: in this region \mathrm{emoDrive} = \mathrm{coeff}, where \mathrm{coeff} = 1.0 - |\mathrm{scalar}|. Since |\sigma_t| \in (0.75, 1.0), the minimum value B_{\mathrm{low}} satisfies 0 < B_{\mathrm{low}}.
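To make the boundedness claim explicit, the following short derivation fills in the step from the strict bound |\tanh(\cdot)| < 1 to 0 < B_{\mathrm{low}}. The constant M bounding |\Delta\mathrm{EMA}| is an assumption of this sketch (any bounded loss history supplies one); it is not a constant defined in the paper.

\[
|\sigma_t| \;=\; \bigl|\tanh(\Delta\mathrm{EMA}/\mathrm{scale})\bigr| \;\le\; \tanh(M/\mathrm{scale}) \;=:\; \sigma_{\max} \;<\; 1,
\]
\[
\mathrm{emoDrive} \;=\; 1 - |\sigma_t| \;\ge\; 1 - \sigma_{\max} \;=:\; B_{\mathrm{low}} \;>\; 0.
\]

The suppression zone thus never drives the step to exactly zero, while the acceleration-zone boost (at least 8) supplies the upper constant B_{\mathrm{high}}.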
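As a concrete illustration of the multi-EMA mechanism of Section 2.1, the following is a minimal sketch of the three-tier EMA and the emotional scalar. The smoothing coefficients alpha_short, alpha_medium, alpha_long and the unit scale are illustrative choices, not the published EmoNAVI v3.6 values.

import math

class EmotionalScalar:
    """Three-tier EMA of the loss and the scalar sigma_t = tanh(dEMA / scale)."""

    def __init__(self, alpha_short=0.3, alpha_medium=0.1, alpha_long=0.01, scale=1.0):
        self.alphas = (alpha_short, alpha_medium, alpha_long)  # illustrative values
        self.emas = [None, None, None]                         # short, medium, long
        self.scale = scale

    def update(self, loss):
        # EMA_{k,t} = (1 - alpha_k) * EMA_{k,t-1} + alpha_k * L_t for each tier.
        for i, a in enumerate(self.alphas):
            self.emas[i] = loss if self.emas[i] is None else (1 - a) * self.emas[i] + a * loss
        # Delta EMA = EMA_long - EMA_short: a temporal proxy for higher-order change.
        delta = self.emas[2] - self.emas[0]
        # Nonlinear compression of the higher-order signal into [-1, 1].
        return math.tanh(delta / self.scale)

For example, a slowly decreasing loss keeps \sigma_t near zero, while a sudden loss spike pushes |\sigma_t| toward 1.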
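Similarly, the two-zone rule of Section 3.1 can be sketched as a single coefficient applied to the step size. The 0.75 threshold and the 1 - |\sigma_t| suppression come from the text; treating "at least 8 times" as a flat factor of 8, and gating the acceleration zone purely on that threshold, are simplifying assumptions of this sketch.

def emo_drive(sigma, threshold=0.75, boost=8.0):
    # Suppression zone: |sigma_t| > 0.75 -> coeff = 1 - |sigma_t|, i.e. O(1 - |sigma_t|),
    # the COCOB-style "reset of the betting amount" after a losing streak.
    if abs(sigma) > threshold:
        return 1.0 - abs(sigma)
    # Acceleration zone: sigma_t is stable, so the estimated distance D-hat is
    # scaled up and the effective step is boosted (at least 8x per the paper).
    return boost

# Hypothetical usage: scale a base learning rate by the drive coefficient.
# effective_lr = base_lr * emo_drive(sigma_t)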