Papers and Presentations

[1] Chun Hei Yip, Rajashree Agrawal, Lawrence Chan, and Jason Gross. Modular addition without black-boxes: Compressing explanations of MLPs that compute numerical integration, December 2024. [ bib | arXiv | https ]

The goal of mechanistic interpretability is discovering simpler, low-rank algorithms implemented by models. While we can compress activations into features, compressing nonlinear feature-maps---like MLP layers---is an open problem. In this work, we present the first case study in rigorously compressing nonlinear feature-maps, which are the leading asymptotic bottleneck to compressing small transformer models. We work in the classic setting of modular addition models, and target a non-vacuous bound on the behaviour of the ReLU MLP in time linear in the parameter-count of the circuit. To study the ReLU MLP analytically, we use the infinite-width lens, which turns post-activation matrix multiplications into approximate integrals. We discover a novel interpretation of the MLP layer in one-layer transformers implementing the “pizza” algorithm: the MLP can be understood as evaluating a quadrature scheme, where each neuron computes the area of a rectangle under the curve of a trigonometric integral identity. Our code is available at https://tinyurl.com/mod-add-integration.
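
As a toy numerical check of the kind of trigonometric integral identity alluded to above (this specific identity is our illustration and assumes nothing about the paper's exact scheme): averaging ReLU(cos(x - θ))·cos(θ) over evenly spaced phases θ is a Riemann sum for an integral that evaluates to cos(x)/4, so a bank of phase-shifted ReLU neurons can compute a continuous trigonometric quantity, one rectangle of area per neuron.

    import numpy as np

    N = 512                       # one quadrature node per "neuron"
    theta = 2 * np.pi * np.arange(N) / N
    relu = lambda z: np.maximum(z, 0)

    x = 0.7                       # arbitrary test angle
    approx = np.mean(relu(np.cos(x - theta)) * np.cos(theta))
    print(approx, np.cos(x) / 4)  # the two agree; error shrinks as N grows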

[2] Jason Gross, Rajashree Agrawal, Thomas Kwa, Euan Ong, Chun Hei Yip, Alex Gibson, Soufiane Noubir, and Lawrence Chan. Compact proofs of model performance via mechanistic interpretability, December 2024. Accepted to the Thirty-Eighth Annual Conference on Neural Information Processing Systems. [ bib | DOI | arXiv | publication | presentation (.pdf) | presentation (.pptx) | presentation (Google Slides) | blog (AF) | poster (NeurIPS) | poster (.pdf) | project (GitHub) ]

We propose using mechanistic interpretability -- techniques for reverse engineering model weights into human-interpretable algorithms -- to derive and compactly prove formal guarantees on model performance. We prototype this approach by formally proving accuracy lower bounds for a small transformer trained on Max-of-K, validating proof transferability across 151 random seeds and four values of K. We create 102 different computer-assisted proof strategies and assess their length and tightness of bound on each of our models. Using quantitative metrics, we find that shorter proofs seem to require and provide more mechanistic understanding. Moreover, we find that more faithful mechanistic understanding leads to tighter performance bounds. We confirm these connections by qualitatively examining a subset of our proofs. Finally, we identify compounding structureless errors as a key challenge for using mechanistic interpretability to generate compact proofs of model performance.
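
For context on the baseline such proofs compete with, here is a minimal sketch (with a hypothetical toy model standing in for the paper's trained transformers) of a brute-force accuracy certificate for Max-of-K: enumerating all vocab**K inputs yields a sound lower bound, at exactly the exponential cost that compact, mechanistic proofs aim to undercut.

    from itertools import product

    VOCAB, K = 8, 3  # hypothetical small instance

    def toy_model(xs):
        return max(xs)  # stand-in for a trained network's argmax output

    correct = sum(toy_model(xs) == max(xs)
                  for xs in product(range(VOCAB), repeat=K))
    print(f"certified accuracy >= {correct}/{VOCAB ** K}")  # exhaustive bound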

[3] Wilson Wu, Louis Jaburi, Jacob Drori, and Jason Gross. Unifying and verifying mechanistic interpretations: A case study with group operations, October 2024. [ bib | DOI | arXiv | https ]

A recent line of work in mechanistic interpretability has focused on reverse-engineering the computation performed by neural networks trained on the binary operation of finite groups. We investigate the internals of one-hidden-layer neural networks trained on this task, revealing previously unidentified structure and producing a more complete description of such models that unifies the explanations of previous works (Chughtai et al., 2023; Stander et al., 2024). Notably, these models approximate equivariance in each input argument. We verify that our explanation applies to a large fraction of networks trained on this task by translating it into a compact proof of model performance, a quantitative evaluation of model understanding. In particular, our explanation yields a guarantee of model accuracy that runs in 30% of the time of brute force and gives a ≥95% accuracy bound for 45% of the models we trained. We were unable to obtain nontrivial non-vacuous accuracy bounds using only explanations from previous works.
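
For orientation, one standard way to write equivariance separately in each input argument of a learned map f on a group G (the paper's precise, approximate notion may differ) is

    \[ f(g \cdot a,\; b) \approx \rho(g)\, f(a, b) \qquad\text{and}\qquad f(a,\; b \cdot g) \approx \rho'(g)\, f(a, b) \quad \text{for all } g \in G, \]

where ρ and ρ' are group representations acting on the output.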

[4] Jason Gross. Short formal proofs of transformers via mechanistic interpretability, August 2024. Presented at ILIAD Conference, Berkeley, California. [ bib | presentation (YouTube) | presentation (Google Slides) ]

It's possible to formally prove theorems about concrete neural nets --- mechanistic understanding is what allows us to make the proofs short enough to be feasible to construct and check while tightly constraining the actual behavior. This framework gives us a quantitative and theoretically grounded metric for measuring the quality of an explanation of neural net behavior. We'll present the results from our initial explorations of this agenda (https://www.alignmentforum.org/posts/bRsKimQcPTX3tNNJZ/compact-proofs-of-model-performance-via-mechanistic), and bridge into discussing exciting future directions and ongoing work in making models more interpretable via fine-tuning on partial explanations.

[5] Chun Hei Yip, Rajashree Agrawal, and Jason Gross. ReLU MLPs can compute numerical integration: Mechanistic interpretation of a non-linear activation, June 2024. Accepted to the ICML 2024 Workshop on Mechanistic Interpretability. [ bib | poster (.pdf) | code (Colab) | https ]

Extending the analysis from Nanda et al. (2023) and Zhong et al. (2023), we offer an end-to-end interpretation of the one-layer MLP-only modular addition transformer model with symmetric embeddings. We present a clear and mathematically rigorous description of the computation at each layer, in preparation for the proofs-based verification approach set out in concurrent work under review. In doing so, we present a new interpretation of MLP layers: that they implement quadrature schemes to carry out numerical integration, providing anecdotal and mathematical evidence in support. This overturns the existing idea that neurons in neural networks are merely on-off switches that test for the presence of “features” -- instead, multiple neurons can be combined in non-trivial ways to produce continuous quantities.

[6] Jason Gross. Guarantees-driven mechanistic interpretability: Formal proof size as a metric for mechanistic detail of understanding, February 2024. Presented at FAR AI's weekly seminar. [ bib | presentation (revised) (.pdf) | presentation (revised) (.pptx) | presentation (revised) (Google Slides) | presentation (.pdf) | presentation (.pptx) | presentation (Google Slides) | project (GitHub) ]
[7] Jason Gross, Andres Erbsen, Jade Philipoom, Rajashree Agrawal, and Adam Chlipala. Towards a scalable proof engine: A performant prototype rewriting primitive for Coq. Journal of Automated Reasoning, 68(3):19, August 2024. [ bib | DOI | arXiv | publication (Springer) ]

We address the challenges of scaling verification efforts to match the increasing complexity and size of systems. We propose a research agenda aimed at building a performant proof engine by studying the asymptotic performance of proof engines and redesigning their building blocks. As a case study, we explore equational rewriting and introduce a novel prototype proof engine building block for rewriting in Coq, utilizing proof by reflection for enhanced performance. Our prototype implementation can significantly improve the development of verified compilers, as demonstrated in a case study with the Fiat Cryptography toolchain. The resulting extracted command-line compiler is about 1000× faster while featuring simpler compiler-specific proofs. This work lays some foundation for scaling verification efforts and contributes to the broader goal of developing a proof engine with good asymptotic performance, ultimately aimed at enabling the verification of larger and more complex systems.

[8] Jason Gross. MetaCoq quotation: Partial work towards Löb's theorem, October 2023. Presented remotely to the Gallinette team in Nantes at an informal workshop on meta-programming and modal type theories with native quotation operations. [ bib | presentation (.pdf) | presentation (YouTube) | presentation (.pptx) | presentation (Google Slides) | project (GitHub) ]
[9] Joel Kuepper, Andres Erbsen, Jason Gross, Owen Conoly, Chuyue Sun, Samuel Tian, David Wu, Adam Chlipala, Chitchanok Chuengsatiansup, Daniel Genkin, Markus Wagner, and Yuval Yarom. CryptOpt: Verified compilation with random program search for cryptographic primitives. In PLDI'23: Proceedings of the 44th ACM SIGPLAN Conference on Programming Language Design and Implementation, June 2023. Distinguished Paper Award. [ bib | arXiv | http ]

Most software domains rely on compilers to translate high-level code to multiple different machine languages, with performance not too much worse than what developers would have the patience to write directly in assembly language. However, cryptography has been an exception, where many performance-critical routines have been written directly in assembly (sometimes through metaprogramming layers). Some past work has shown how to do formal verification of that assembly, and other work has shown how to generate C code automatically along with formal proof, but with consequent performance penalties vs. the best-known assembly. We present CryptOpt, the first compilation pipeline that specializes high-level cryptographic functional programs into assembly code significantly faster than what GCC or Clang produce, with mechanized proof (in Coq) whose final theorem statement mentions little beyond the input functional program and the operational semantics of x86-64 assembly. On the optimization side, we apply randomized search through the space of assembly programs, with repeated automatic benchmarking on target CPUs. On the formal-verification side, we connect to the Fiat Cryptography framework (which translates functional programs into C-like IR code) and extend it with a new formally verified program-equivalence checker, incorporating a modest subset of known features of SMT solvers and symbolic-execution engines. The overall prototype is quite practical, e.g. producing new fastest-known implementations for the relatively new Intel i9 12G, of finite-field arithmetic for both Curve25519 (part of the TLS standard) and the Bitcoin elliptic curve secp256k1.
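
The optimization half of the pipeline can be pictured with a generic random-search skeleton (a sketch under assumed stand-in callables mutate, measure, and equivalent; CryptOpt itself searches over x86-64 instruction sequences, benchmarks repeatedly on the target CPU, and discharges correctness with its formally verified equivalence checker):

    def random_search(candidate, mutate, measure, equivalent, budget=1000):
        best, best_cost = candidate, measure(candidate)
        for _ in range(budget):
            trial = mutate(best)
            if not equivalent(trial, candidate):
                continue  # keep only semantics-preserving candidates
            cost = measure(trial)
            if cost < best_cost:
                best, best_cost = trial, cost
        return best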

[10] Jason Gross and Andres Erbsen. 10 years of superlinear slowness in Coq, August 2022. Presented at The Coq Workshop 2022. [ bib | presentation (.pdf) | presentation (YouTube) | presentation (.pptx) | presentation (.pptx, annotated with notes) | .pdf ]

In most programming languages, asymptotic performance issues can almost always be explained by reference to the algorithm being implemented. At most, the standard asymptotic performance of explicitly used operations on chosen data structures must be considered. Even the constant factors in performance bottlenecks can often be explained without reference to the implementation of the interpreter, compiler, or underlying machine. In 10+ years of working with Coq, we (Jason, Andres, and our colleagues) have found this pattern, which holds across multiple programming languages, to be the exception rather than the rule in Coq! This turns performant proof engineering, especially performant proof automation engineering, from straightforward science into unpredictable and fragile guesswork. By presenting in detail a sampling of examples, we propose a defense of the thesis: Performance bottlenecks in proof automation almost always result from inefficiencies in parts of the system which are conceptually distant from the theorem being proven. Said another way, debugging, understanding, and fixing performance bottlenecks in automated proofs almost always requires extensive knowledge of the proof engine, and almost never requires any domain-specific knowledge of the theorem being proven. Further, there is no clear direction of improvement: We know of no systematic proposal, nor even folklore among experts, of what primitives and performance characteristics are sufficient for a performant proof engine. We hope to start a discussion on the obvious corollary of this thesis: This should not be! Our presentation, we hope, will serve as a call for a POPLMark for Proof Engines, a call for designing and implementing a proof engine for scalable performant modular proof automation.

[11] Jason Gross, Théo Zimmermann, Miraya Poddar-Agrawal, and Adam Chlipala. Automatic test-case reduction in proof assistants: A case study in Coq. In June Andronick and Leonardo de Moura, editors, Proceedings of the 13th International Conference on Interactive Theorem Proving (ITP 2022), volume 237 of Leibniz International Proceedings in Informatics (LIPIcs), pages 18:1--18:18, Dagstuhl, Germany, August 2022. Schloss Dagstuhl -- Leibniz-Zentrum für Informatik. [ bib | DOI | publication (dblp LIPIcs) | presentation (.pdf) | presentation (YouTube) | presentation (.pptx) | presentation (.pptx, annotated with notes) | original conference submission (.pdf) | artifact (Figshare) | project (GitHub) | .pdf ]

As the adoption of proof assistants increases, there is a need for efficiency in identifying, documenting, and fixing compatibility issues that arise from proof-assistant evolution. We present the Coq Bug Minimizer, a tool for reproducing buggy behavior with minimal and standalone files, integrated with coqbot to trigger automatically on failures from Coq's reverse dependency compatibility testing. Our tool eliminates the overhead of having to download, set up, compile, and then explore and understand large developments, enabling Coq developers to easily obtain modular test-case files for fast experimentation. In this paper, we describe insights about how test-case reduction is different in Coq than in traditional compilers. We expect that our insights will generalize to other proof assistants. We evaluate the Coq Bug Minimizer on over 150 compatibility testing failures. Our tool succeeds in reducing failures to smaller test cases roughly 75% of the time. The minimizer produces a fully standalone test case 89% of the time, and it is on average about one-third the size of the original test. The average reduced test case compiles in 1.25 seconds, with 75% taking under half a second.
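
The general shape of such a reducer is a greedy delta-debugging loop, sketched below on plain lines of text (the Coq Bug Minimizer's actual passes are Coq-aware, e.g. inlining required files and admitting proofs, rather than raw line deletion):

    def reduce(lines, still_fails):
        # drop one line at a time, keeping any deletion that still reproduces
        # the failure, until no single deletion succeeds
        changed = True
        while changed:
            changed = False
            i = 0
            while i < len(lines):
                trial = lines[:i] + lines[i+1:]
                if still_fails(trial):
                    lines, changed = trial, True
                else:
                    i += 1
        return lines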

[12] Jason Gross, Andres Erbsen, Jade Philipoom, Miraya Poddar-Agrawal, and Adam Chlipala. Accelerating verified-compiler development with a verified rewriting engine. In June Andronick and Leonardo de Moura, editors, Proceedings of the 13th International Conference on Interactive Theorem Proving (ITP 2022), volume 237 of Leibniz International Proceedings in Informatics (LIPIcs), pages 17:1--17:18, Dagstuhl, Germany, August 2022. Schloss Dagstuhl -- Leibniz-Zentrum für Informatik. [ bib | DOI | arXiv | publication (dblp LIPIcs) | presentation (.pdf) | presentation (YouTube) | presentation (.pptx) | presentation (.pptx, annotated with notes) | original conference submission (.pdf) | project (GitHub) | artifact (.tar.gz) | .pdf ]

Compilers are a prime target for formal verification, since compiler bugs invalidate higher-level correctness guarantees, but compiler changes may become more labor-intensive to implement, if they must come with proof patches. One appealing approach is to present compilers as sets of algebraic rewrite rules, which a generic engine can apply efficiently. Now each rewrite rule can be proved separately, with no need to revisit past proofs for other parts of the compiler. We present the first realization of this idea, in the form of a framework for the Coq proof assistant. Our new Coq command takes normal proved theorems and combines them automatically into fast compilers with proofs. We applied our framework to improve the Fiat Cryptography toolchain for generating cryptographic arithmetic, producing an extracted command-line compiler that is about 1000× faster while actually featuring simpler compiler-specific proofs.
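
The architectural idea can be miniaturized as follows (a Python sketch of the general rewriting-as-compilation pattern, not the paper's reflective Coq engine): each rule is a small, independently justified function, and a generic engine normalizes terms bottom-up.

    def rewrite(term, rules):
        # terms are atoms or tuples ("op", subterm, ...); normalize children first
        if isinstance(term, tuple):
            term = (term[0],) + tuple(rewrite(t, rules) for t in term[1:])
        for rule in rules:
            out = rule(term)
            if out is not None:
                return rewrite(out, rules)  # re-normalize after each step
        return term

    rules = [  # each rule would be proved correct on its own
        lambda t: t[1] if isinstance(t, tuple) and t[0] == "mul" and t[2] == 1 else None,
        lambda t: t[1] if isinstance(t, tuple) and t[0] == "add" and t[2] == 0 else None,
    ]

    print(rewrite(("add", ("mul", "x", 1), 0), rules))  # prints: x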

[13] Jason S. Gross. Performance Engineering of Proof-Based Software Systems at Scale. PhD thesis, Massachusetts Institute of Technology, February 2021. [ bib | DSpace@MIT | presentation (.pdf) | presentation (YouTube) | presentation (.pptx) | .pdf ]

Formal verification is increasingly valuable as our world comes to rely more on software for critical infrastructure. A significant and understudied cost of developing mechanized proofs, especially at scale, is the computer performance of proof generation. This dissertation aims to be a partial guide to identifying and resolving performance bottlenecks in dependently typed tactic-driven proof assistants like Coq. We present a survey of the landscape of performance issues in Coq, with micro- and macro-benchmarks. We describe various metrics that allow prediction of performance, such as term size, goal size, and number of binders, and note the occasional surprising lack of a bottleneck for some factors, such as total proof term size. To our knowledge such a roadmap to performance bottlenecks is a new contribution of this dissertation. The central new technical contribution presented by this dissertation is a reflective framework for partial evaluation and rewriting, already used to compile a code generator for field-arithmetic cryptographic primitives which generates code currently used in Google Chrome. We believe this prototype is the first scalably performant realization of an approach for code specialization which does not require adding to the trusted code base. Our extensible engine, which combines the traditional concepts of tailored term reduction and automatic rewriting from hint databases with on-the-fly generation of inductive codes for constants, is also of interest to replace these ingredients in proof assistants' proof checkers and tactic engines. Additionally, we use the development of this framework itself as a case study for the various performance issues that can arise when designing large proof libraries. We also present a novel method of simple and fast reification, developed and published during this PhD. Finally, we present additional lessons drawn from the case studies of a category-theory library, a proof-producing parser generator, and cryptographic code generation.

[14] Jason Gross. A limited case for reification by type inference, January 2021. Presented at The Seventh International Workshop on Coq for Programming Languages (CoqPL'21). [ bib | presentation (YouTube) | code (.v) | .pdf ]

Proof by reflection is a common and well-studied automation tool. Reification---generally written using Ltac, OCaml, typeclasses, or canonical structures---is the means by which a structured representation is derived from an unstructured representation. The reflective automation then operates on the structured representation, relying on an interpretation or denotation function to justify a correspondence between the structured and unstructured representations. A couple of years ago, I presented a trick for blazing fast reification in two lines of Ltac---using the pattern tactic---which I termed reification by parametricity. While I still advocate for parametricity as the preferred method of domain-specific reification, I would like to present here yet another method. While reification typically requires meta-programming features, I was surprised and delighted to discover that, in some restricted cases, reification can be performed entirely by a combination of the notation system and type inference. In some sense, this is trivial: by redefining the basic syntactic notations, a term can be “reified” merely by writing the same symbols in another scope. In another sense, though, this trick is quite surprising: we use the notation system merely to insert “reify here” functions at every atom, and the reification itself is in fact performed by type inference. My hope is that the audience will walk away with this new trick in their toolbox, and that some day some problem will come along demanding a slight generalization of this trick, and that generalization will be new and interesting in its own right. This, after all, is how reification by parametricity was discovered. I propose to present the one example I have for this trick: reifying the type structure of a function in a way that allows manipulations of the arguments, such as uncurrying, reassociation of the uncurried structure, and reordering. I will present the simple code for this example in detail. My goal will be that the audience understand completely how it works, why it works, and how it might be used elsewhere.
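
The "trivial sense" described above can be played out in any language with operator overloading (a Python analogue of Coq's notation scopes, not the type-inference trick itself): the same symbols, interpreted for a different type of atom, build a syntax tree instead of computing a value.

    class Atom:
        def __init__(self, name): self.name = name
        def __add__(self, other): return Node("+", self, other)
        def __mul__(self, other): return Node("*", self, other)

    class Node(Atom):
        def __init__(self, op, l, r): self.op, self.l, self.r = op, l, r

    def show(t):
        return f"({show(t.l)} {t.op} {show(t.r)})" if isinstance(t, Node) else t.name

    x, y = Atom("x"), Atom("y")
    print(show(x * y + x))  # ((x * y) + x): an AST, reified from ordinary syntax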

[15] Linden B. Huhmann, Charles F. Harvey, Jason Gross, Anjal Uddin, Imtiaz Choudhury, Kazi M. Ahmed, John M. Duxbury, Benjamin Bostick, and Alexander van Geen. Evaluation of a field kit for testing arsenic in paddy soil contaminated by irrigation water. Geoderma, 382:114755, January 2021. [ bib | DOI | publication (ScienceDirect) ]

Rice is the primary crop in Bangladesh and rice yield is diminished due to the buildup of arsenic (As) in soil from irrigation with high-As groundwater. Soil testing with an inexpensive kit could help farmers target high-As soil for mitigation or decide to switch to a different crop that is less sensitive to As in soil. A total of 3240 field kit measurements of As in 0.5 g of fresh soil added to 50 mL of water were compared with total soil As concentrations measured on oven-dried homogenized soil by X-ray fluorescence (XRF). For sets of 12 soil samples collected within a series of rice fields, the average of kit As measurements was a linear function of the average of XRF measurements (r2 = 0.69). Taking into account that the kit overestimates water As concentrations by about a factor of two, the relationship suggests that about a quarter of the As in paddy soil is released in the kit’s reaction vessel. Using the relationship and considering XRF measurements as the reference, the 12-sample average determined correctly whether soil As was above or below a 30 mg/kg threshold in 86% of cases where soil As was above the threshold and in 79% of cases where soil As was below the threshold. We also used a Bayesian approach using 12 kit measurements to estimate the probability that soil As was above a given threshold indicated by XRF measurements. The Bayesian approach is theoretically optimal but was only slightly more accurate than the linear regression. These results show that rice farmers can identify high-As portions of their fields for mitigation using a dozen field kit measurements on fresh soil and base their decisions on this information.

Keywords: Field kit, Rice paddy, Irrigation, Arsenic
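
The Bayesian step can be sketched with a simple grid posterior (all numbers below are hypothetical stand-ins; the paper's calibration, units, and priors differ): combine the 12 kit readings under an assumed linear response with Gaussian noise, then integrate the posterior mass above the threshold.

    import numpy as np

    grid = np.linspace(0, 100, 1001)        # candidate soil As (mg/kg)
    slope, sigma = 0.5, 10.0                # hypothetical kit calibration
    readings = np.array([20, 25, 18, 30, 22, 27, 19, 24, 26, 21, 23, 28])

    loglik = sum(-0.5 * ((r - slope * grid) / sigma) ** 2 for r in readings)
    post = np.exp(loglik - loglik.max())    # flat prior, for illustration
    post /= post.sum()
    print("P(soil As > 30 mg/kg) =", post[grid > 30].sum())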

[16] Clément Pit-Claudel, Peng Wang, Benjamin Delaware, Jason Gross, and Adam Chlipala. Extensible extraction of efficient imperative programs with foreign functions, manually managed memory, and proofs. In Nicolas Peltier and Viorica Sofronie-Stokkermans, editors, Proceedings of the 9th International Joint Conference on Automated Reasoning (IJCAR'20), pages 119--137, Cham, June 2020. Springer International Publishing. [ bib | DOI | publication (Springer) | project (GitHub) ]

We present an original approach to sound program extraction in a proof assistant, using syntax-driven automation to derive correct-by-construction imperative programs from nondeterministic functional source code. Our approach does not require committing to a single inflexible compilation strategy and instead makes it straightforward to create domain-specific code translators. In addition to a small set of core definitions, our framework is a large, user-extensible collection of compilation rules each phrased to handle specific language constructs, code patterns, or data manipulations. By mixing and matching these pieces of logic, users can easily tailor extraction to their own domains and programs, getting maximum performance and ensuring correctness of the resulting assembly code. Using this approach, we complete the first proof-generating pipeline that goes automatically from high-level specifications to assembly code. In our main case study, the original specifications are phrased to resemble SQL-style queries, while the final assembly code does manual memory management, calls out to foreign data structures and functions, and is suitable to deploy on resource-constrained platforms. The pipeline runs entirely within the Coq proof assistant, leading to final, linked assembly code with overall full-functional-correctness proofs in separation logic.

[17] Andres Erbsen, Jade Philipoom, Jason Gross, Robert Sloan, and Adam Chlipala. Simple high-level code for cryptographic arithmetic -- with proofs, without compromises. In Proceedings of the 40th IEEE Symposium on Security and Privacy (S&P'19), May 2019. [ bib | DOI | ACM DL Author-ize Publication | project (GitHub) | .pdf ]

We introduce a new approach for implementing cryptographic arithmetic in short high-level code with machine-checked proofs of functional correctness. We further demonstrate that simple partial evaluation is sufficient to transform into the fastest-known C code, breaking the decades-old pattern that the only fast implementations are those whose instruction-level steps were written out by hand. These techniques were used to build an elliptic-curve library that achieves competitive performance for 80 prime fields and multiple CPU architectures, showing that implementation and proof effort scales with the number and complexity of conceptually different algorithms, not their use cases. As one outcome, we present the first verified high-performance implementation of P-256, the most widely used elliptic curve. Implementations from our library were included in BoringSSL to replace existing specialized code, for inclusion in several large deployments for Chrome, Android, and CloudFlare.
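
The role partial evaluation plays here can be seen in miniature with the textbook power-function example (a sketch only; Fiat Cryptography specializes high-level field arithmetic to a fixed prime inside Coq, with proof, not in Python): fixing one input statically unrolls an interpreter-style loop into straight-line code.

    def specialize_pow(n):
        # emit straight-line square-and-multiply code for a fixed exponent n
        lines = ["def pow_n(x):", "    r = 1"]
        for bit in bin(n)[2:]:
            lines.append("    r = r * r")
            if bit == "1":
                lines.append("    r = r * x")
        lines.append("    return r")
        return "\n".join(lines)

    ns = {}
    exec(specialize_pow(13), ns)
    assert ns["pow_n"](2) == 2 ** 13  # specialized code, same behavior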

[18] Jason Gross, Andres Erbsen, and Adam Chlipala. Reification by parametricity: Fast setup for proof by reflection, in two lines of Ltac. In Jeremy Avigad and Assia Mahboubi, editors, Proceedings of the 9th International Conference on Interactive Theorem Proving (ITP'18), pages 289--305, Cham, July 2018. Springer International Publishing. [ bib | DOI | publication (Springer) | presentation (.pdf) | presentation (.pptx, annotated with notes) | original conference submission (.pdf) | artifact (GitHub) | artifact (.tar.gz) | .pdf ]

We present a new strategy for performing reification in Coq. That is, we show how to generate first-class abstract syntax trees from “native” terms of Coq's logic, suitable as inputs to verified compilers or procedures in the proof-by-reflection style. Our new strategy, based on simple generalization of subterms as variables, is straightforward, short, and fast. In its pure form, it is only complete for constants and function applications, but “let” binders, eliminators, lambdas, and quantifiers can be accommodated through lightweight coding conventions or preprocessing. We survey the existing methods of reification across multiple Coq metaprogramming facilities, describing various design choices and tricks that can be used to speed them up, as well as various limitations. We report benchmarking results for 18 variants, in addition to our own, finding that our own reification outperforms 16 of these methods in all cases, and one additional method in some cases; writing an OCaml plugin is the only method tested to be faster. Our method is the most concise of the strategies we considered, reifying terms using only two to four lines of Ltac---beyond lists of the identifiers to reify and their reified variants. Additionally, our strategy automatically provides error messages that are no less helpful than Coq's own error messages.

[19] Jason Gross. Presentation proposal for teaching your rooster to crow in C, July 2018. Presented at The Coq Workshop 2018. [ bib | code (.html) | code (.v) | .pdf ]

Coq's notation system is both extremely powerful and confusingly ad-hoc. While powerful enough to pretty-print abstract syntax trees in most domain-specific languages, how to do so does not seem to be common knowledge. Typical questions arising from such an endeavor might include “How do I pick notation levels?”, “Why are these notations clashing?”, “Which things should be marked as symbols?”, “How do I use boxes in format?”, and “How do I get parentheses to show up (only) where I want them to?” This interactive presentation aims to serve as a guide to these questions and more, by demonstrating and explaining how to pretty-print subsets of C using only Coq's Notation mechanism.

[20] Adam Chlipala, Benjamin Delaware, Samuel Duchovni, Jason Gross, Clément Pit-Claudel, Sorawit Suriyakarn, Peng Wang, and Katherine Ye. The end of history? Using a proof assistant to replace language design with library design. In Benjamin S. Lerner, Rastislav Bodík, and Shriram Krishnamurthi, editors, Proceedings of the 2nd Summit oN Advances in Programming Languages (SNAPL'17), volume 71 of Leibniz International Proceedings in Informatics (LIPIcs), pages 3:1--3:15, Dagstuhl, Germany, May 2017. Schloss Dagstuhl -- Leibniz-Zentrum für Informatik. [ bib | DOI | project homepage | presentation (.odp) | presentation (.pdf) | .pdf ]

Functionality of software systems has exploded in part because of advances in programming-language support for packaging reusable functionality as libraries. Developers benefit from the uniformity that comes of exposing many interfaces in the same language, as opposed to stringing together hodgepodges of command-line tools. Domain-specific languages may be viewed as an evolution of the power of reusable interfaces, when those interfaces become so flexible as to deserve to be called programming languages. However, common approaches to domain-specific languages give up many of the hard-won advantages of library-building in a rich common language, and even the traditional approach poses significant challenges in learning new APIs. We suggest that instead of continuing to develop new domain-specific languages, our community should embrace library-based ecosystems within very expressive languages that mix programming and theorem proving. Our prototype framework Fiat, a library for the Coq proof assistant, turns languages into easily comprehensible libraries via the key idea of modularizing functionality and performance away from each other, the former via macros that desugar into higher-order logic and the latter via optimization scripts that derive efficient code from logical programs.

Keywords: domain-specific languages, synthesis, verification, proof assistants, software development

[21] Andrej Bauer, Jason Gross, Peter LeFanu Lumsdaine, Michael Shulman, Matthieu Sozeau, and Bas Spitters. The HoTT library: A formalization of homotopy type theory in Coq. In Proceedings of the 6th ACM SIGPLAN Conference on Certified Programs and Proofs, CPP 2017, pages 164--172, New York, NY, USA, January 2017. ACM. [ bib | DOI | arXiv | ACM DL Author-ize Publication | .pdf ]

We report on the development of the HoTT library, a formalization of homotopy type theory in the Coq proof assistant. It formalizes most of basic homotopy type theory, including univalence, higher inductive types, and significant amounts of synthetic homotopy theory, as well as category theory and modalities. The library has been used as a basis for several independent developments. We discuss the decisions that led to the design of the library, and we comment on the interaction of homotopy type theory with recently introduced features of Coq, such as universe polymorphism and private inductive types.

Keywords: Coq, Higher inductive types, Homotopy type theory, Univalent foundations, Universe polymorphism

[22] Jason Gross. The HoTT/HoTT library in Coq: Designing for speed, July 2016. Presented at The 5th International Congress on Mathematical Software (ICMS 2016). [ bib | presentation (.pptx, annotated with notes) | .pdf ]

The HoTT/HoTT library is one of the two major Coq libraries exploring univalent foundations and homotopy type theory, the other being UniMath. The library includes formalization of the basic type formers, some axiomatic higher inductive types including the circle, the interval, suspensions, and quotients, a formalization of modalities (reflective subtoposes) using modules as a way to quantify over all universe levels, formalizations of Cantor spaces and the surreals, the basic theory of h-levels, and a significant amount of category theory centered around comma categories and functoriality of various constructions involving comma categories. A significant amount of work has gone into ensuring that the library compiles quickly. This talk will discuss the various constructions in the HoTT library, as well as the design choices and features, both of Coq and of univalent type theory, which allow our library to compile and typecheck quickly.

[23] Jason Gross. An extensible framework for synthesizing efficient, verified parsers. Master's thesis, Massachusetts Institute of Technology, September 2015. [ bib | DSpace@MIT | .pdf ]

Parsers have a long history in computer science. This thesis proposes a novel approach to synthesizing efficient, verified parsers by refinement, and presents a demonstration of this approach in the Fiat framework by synthesizing a parser for arithmetic expressions. The benefits of this framework may include more flexibility in the parsers that can be described, more control over the low-level details when necessary for performance, and automatic or mostly automatic correctness proofs.

[24] Tobias Tebbi and Jason Gross. A profiler for Ltac, January 2015. Presented at The First International Workshop on Coq for Programming Languages (CoqPL'15). [ bib | .pdf ]

We present a simple profiler for the Ltac tactic language of the Coq Proof Assistant. It measures the time spent in invocations of primitive tactics as well as tactics defined in Ltac and their inner invocations. The profiler is controlled using Vernacular commands and prints an aggregated view that differentiates between tactic invocations depending on their call-tree location.

[25] Jason Gross. Coq bug minimizer, January 2015. Presented at The First International Workshop on Coq for Programming Languages (CoqPL'15). [ bib | reviews | .pdf ]

Are bugs the bane of your existence? Do you dread Coq upgrades, because they mean you'll have to spend days tracking down subtle failures deep in your developments? Have you ever hit an anomaly that just wouldn't go away, and wished you understood what triggered it? Have you ever been tormented by two blocks of code that looked identical, but behaved differently? Do you wish you could submit more helpful error reports, but don't want to put in the time to construct minimal examples? If you answered “yes” to any of these questions, then the Coq Bug Minimizer is for you! Clone your own copy at https://github.com/JasonGross/coq-bug-finder.

[26] Ben Delaware, Clément Pit-Claudel, Jason Gross, and Adam Chlipala. Fiat: Deductive synthesis of abstract data types in a proof assistant. In Proceedings of the 42nd ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages (POPL'15), January 2015. [ bib | DOI | ACM DL Author-ize Publication | project homepage | artifact (.tar.gz) | .pdf ]

We present Fiat, a library for the Coq proof assistant supporting refinement of declarative specifications into efficient functional programs with a high degree of automation. Each refinement process leaves a proof trail, checkable by the normal Coq kernel, justifying its soundness. We focus on the synthesis of abstract data types that package methods with private data. We demonstrate the utility of our framework by applying it to the synthesis of query structures -- abstract data types with SQL-like query and insert operations. Fiat includes a library for writing specifications of query structures in SQL-inspired notation, expressing operations over relations (tables) in terms of mathematical sets. This library includes a set of tactics for automating the refinement of these specifications into efficient, correct-by-construction OCaml code. Using these tactics, a programmer can generate such an implementation completely automatically by only specifying the equivalent of SQL indexes, data structures capturing useful views of the abstract data. We conclude by speculating on the new programming modularity possibilities enabled by an automated refinement system with proved-correct rules.

[27] Jason Gross. Presentation: Input, output, and automation in x86proved, August 2014. Presented at Microsoft Research, Cambridge, UK. [ bib | .pptx | .pdf ]

The x86proved project can now verify assembly programs with input and output! The code-reasoning throughout the project is now mostly automatic. Although not yet push-button verification (specification-level reasoning, in particular, leaves a lot to be desired), these tactics make a significant step towards that goal. This presentation will cover:
• some programs whose I/O behaviour has been verified (including a simplified version of the echo command-line tool)
• the new automation for fully automatic correctness proofs of Hoare-triple rules for basic instructions
• the new automation for applying Hoare rules for assembly instructions automatically
• the basics of how we're specifying and verifying the I/O behaviour of programs

[28] Jason Gross, Adam Chlipala, and David I. Spivak. Experience implementing a performant category-theory library in Coq. In Gerwin Klein and Ruben Gamboa, editors, Proceedings of the 5th International Conference on Interactive Theorem Proving (ITP'14), pages 275--291, Cham, July 2014. Springer International Publishing. [ bib | DOI | arXiv | publication (Springer) | presentation (.pdf) | presentation (.pptx, annotated with notes) | original conference submission (.pdf) | full bibliography | reviews | .pdf ]

We describe our experience implementing a broad category-theory library in Coq. Category theory and computational performance are not usually mentioned in the same breath, but we have needed substantial engineering effort to teach Coq to cope with large categorical constructions without slowing proof script processing unacceptably. In this paper, we share the lessons we have learned about how to represent very abstract mathematical objects and arguments in Coq and how future proof assistants might be designed to better support such reasoning. One particular encoding trick to which we draw attention allows category-theoretic arguments involving duality to be internalized in Coq's logic with definitional equality. Ours may be the largest Coq development to date that uses the relatively new Coq version developed by homotopy type theorists, and we reflect on which new features were especially helpful.

[29] Jason Gross. Presentation proposal for three neat tricks in Coq 8.5, April 2014. Presented at the 6th Coq Workshop. [ bib | code (.html) | code (.v) | reviews | .pdf ]

Coq 8.5 has a number of new features. It has more powerful universe polymorphism support. It allows tactics to be run at interpretation time to construct other terms. The ability to switch from Gallina to Ltac in arbitrary locations nicely complements the constr: notation permitting the switch from Ltac to Gallina in tactics, and opens up many new possibilities. I propose to present three tricks involving these new features: tactics in terms allows the construction of tactics that recurse under binders; tactics in terms together with typeclasses allows overloading notations based on the type of their arguments; and there is a way to talk about universe levels explicitly, helped along by tactics in terms.

[30] Jason Gross. POPL: Minute madness: Category theory in Coq, and program synthesis, January 2014. Presented at the 41st ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages (POPL'14). [ bib | .pdf ]
[31] Jason Gross. Jason Gross' wishlist for Coq, January 2014. [ bib | .pdf ]
[32] Jason Gross. CSAIL student workshop 2013: Computational higher inductive types: Computing with custom equalities, October 2013. Presented at the 2013 MIT CSAIL Student Workshop. [ bib | .pdf ]
[33] Jason Gross. POPL: Minute madness: Database management on top of category theory in Coq: Category of relational schemas = category of categories, January 2013. Presented at the 40th ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages (POPL'13). [ bib | .pdf ]
[34] Jason Gross. Building database management on top of category theory in Coq, January 2013. Presented as a student talk at the 40th ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages (POPL'13). [ bib | .pdf ]
[35] Brenden M. Lake, Ruslan Salakhutdinov, Jason Gross, and Joshua B. Tenenbaum. One shot learning of simple visual concepts. In Proceedings of the 33rd Annual Conference of the Cognitive Science Society, 2011. [ bib | project homepage | videos | .pdf ]

People can learn visual concepts from just one example, but it remains a mystery how this is accomplished. Many authors have proposed that transferred knowledge from more familiar concepts is a route to one shot learning, but what is the form of this abstract knowledge? One hypothesis is that the sharing of parts is core to one shot learning, and we evaluate this idea in the domain of handwritten characters, using a massive new dataset. These simple visual concepts have a rich internal part structure, yet they are particularly tractable for computational models. We introduce a generative model of how characters are composed from strokes, where knowledge from previous characters helps to infer the latent strokes in novel characters. The stroke model outperforms a competing state-of-the-art character model on a challenging one shot learning task, and it provides a good fit to human perceptual data.

This reference list was generated by bibtex2html 1.99.

Drafts

[1] Jason Gross, Jack Gallagher, and Benya Fallenstein. Löb's theorem: A functional pearl of dependently typed quining, March 2016. Submitted to ICFP 2016. [ bib | project (GitHub) | artifact (.zip) | code (.agda) | code (.html) | bibliography | .pdf ]

Löb's theorem states that to prove that a proposition is provable, it is sufficient to prove the proposition under the assumption that it is provable. The Curry-Howard isomorphism identifies formal proofs with abstract syntax trees of programs; Löb's theorem thus implies, for total languages which validate it, that self-interpreters are impossible. We formalize a few variations of Löb's theorem in Agda using an inductive-inductive encoding of terms indexed over types. We verify the consistency of our formalizations relative to Agda by giving them semantics via interpretation functions.
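
For reference, the theorem being formalized, stated in provability-logic notation where □P reads "P is provable":

    \[ \Box(\Box P \to P) \to \Box P \]

Under Curry-Howard, □P becomes a type of syntax trees of programs of type P, which is why, as the abstract notes, total languages validating the theorem cannot have self-interpreters.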

[2] Jason Gross and Adam Chlipala. Parsing parses: A pearl of (dependently typed) programming and proof, August 2015. Submitted to ICFP 2015. [ bib | .pdf ]

We present a functional parser for arbitrary context-free grammars, together with soundness and completeness proofs, all inside Coq. By exposing the parser in the right way with parametric polymorphism and dependent types, we are able to use the parser to prove its own soundness, and, with a little help from relational parametricity, prove its own completeness, too. Of particular interest is one strange instantiation of the type and value parameters: by parsing parse trees instead of strings, we convince the parser to generate its own completeness proof. We conclude with highlights of our experiences iterating through several versions of the Coq development, and some general lessons about dependently typed programming.
