(feat) Type Inference #79

AlSchlo · 2025-04-21T21:11:23Z

Overview

This PR brings the long-awaited type inference to the DSL. This not only makes rules more ergonomic to write, but is also needed for correctness in the optimizer.

The engine needs to know which types are Logical and which are Physical, since the two behave differently. Every expression must therefore resolve to a type.

Furthermore, type inference is used to verify that field accesses and function calls are valid, since these checks are type-dependent.

Some examples:

Strategy

The type inference works in three phases.

1. In from_ast, during the initial AST -> HIR<TypedSpan> transformation, we record all implicit and explicit type information from the program (e.g. literals like 1 or "hello", function annotations, etc.). For each type that is Unknown, we generate a new ID and mark it as either Descending (for unknown closure parameters and map keys) or Ascending (for everything else); these modes are described in more detail in (3).
2. Constraints are generated in generate.rs, which now also performs scope checking (for convenience, as both code paths were extremely similar). Constraints express subtype relationships, field accesses, and function calls.

For example:

let a: Logical = expr
generates the constraint
Logical :> typeof(expr)
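
As a minimal sketch of how such a constraint could be emitted for a let binding: the names Type, Subtype, Generator, fresh_unknown and visit_let below are hypothetical and are not taken from generate.rs, whose real HIR and constraint types are richer.

```rust
// Hypothetical, simplified types for illustration only.
#[derive(Debug, Clone, PartialEq)]
enum Type {
    Logical,
    I64,
    /// Placeholder for a type that is not known yet, identified by a fresh ID.
    Unknown(usize),
}

/// Encodes `parent :> child`, i.e. `child` must be a subtype of `parent`.
#[derive(Debug, Clone, PartialEq)]
struct Subtype {
    parent: Type,
    child: Type,
}

#[derive(Default)]
struct Generator {
    next_id: usize,
    constraints: Vec<Subtype>,
}

impl Generator {
    /// Allocates a new unknown type with a unique ID.
    fn fresh_unknown(&mut self) -> Type {
        let id = self.next_id;
        self.next_id += 1;
        Type::Unknown(id)
    }

    /// Visits `let name: annotation = expr`, where `expr_ty` stands for typeof(expr).
    /// Without an annotation, the binding gets a fresh unknown type instead.
    /// Returns the type bound to `name`.
    fn visit_let(&mut self, annotation: Option<Type>, expr_ty: Type) -> Type {
        let bound = annotation.unwrap_or_else(|| self.fresh_unknown());
        self.constraints.push(Subtype {
            parent: bound.clone(),
            child: expr_ty,
        });
        bound
    }
}
```

Calling visit_let(Some(Type::Logical), expr_ty) records exactly one constraint, Logical :> expr_ty, which the solver later has to satisfy.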

3. The last step resolves the unknown types with a solver (credit to @AarryaSaraf for the base algorithm idea). It works as follows:

// Pseudocode for the constraint solver
function resolve():
    anyChanged = true
    lastError = null

    // Keep iterating until we reach a fixed point (no more changes)
    while anyChanged:
        anyChanged = false
        lastError = null
        
        // Check each constraint and try to refine unknown types
        for each constraint in constraints:
            result = checkConstraint(constraint)
            
            if result is Ok(changed):
                anyChanged |= changed  // Track if any types were refined
            else if result is Err(error, changed):
                anyChanged |= changed  // Still track changes
                lastError = error      // Remember the last error
        
    // After reaching fixed point, return error if any constraint failed
    return lastError ? Err(lastError) : Ok()
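
The loop above could be rendered in Rust roughly as follows. This is a sketch under assumptions: Check and resolve are hypothetical names, the constraint type C, solver state S and error type E are left generic, and the actual code in solver.rs may be organised differently.

```rust
/// Outcome of checking one constraint: whether any unknown type was refined,
/// and an error if the constraint does not hold (yet).
struct Check<E> {
    changed: bool,
    error: Option<E>,
}

/// Re-checks every constraint until a full pass refines nothing,
/// then reports the last error seen during that final pass, if any.
fn resolve<C, S, E>(
    constraints: &[C],
    state: &mut S,
    mut check: impl FnMut(&C, &mut S) -> Check<E>,
) -> Result<(), E> {
    loop {
        let mut any_changed = false;
        let mut last_error = None;

        for constraint in constraints {
            let outcome = check(constraint, state);
            // Track refinements even when the constraint currently fails.
            any_changed |= outcome.changed;
            if outcome.error.is_some() {
                last_error = outcome.error;
            }
        }

        // Fixed point reached: errors from earlier passes were discarded,
        // because a later refinement may have resolved them.
        if !any_changed {
            return match last_error {
                Some(error) => Err(error),
                None => Ok(()),
            };
        }
    }
}
```

Resetting last_error at the start of every pass mirrors the pseudocode: only errors that survive the final, stable pass are reported.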

During type inference, the solver refines unknown types to satisfy subtyping constraints according to their variance:

When an UnknownAsc type is encountered as a parent, it is updated to the least upper bound (LUB) of itself and the child type. These types start at Nothing and ascend up the type hierarchy as needed.
When an UnknownDesc type is encountered as a child, it is updated to the greatest lower bound (GLB) of itself and the parent type. These types start at Universe and descend down the type hierarchy as needed.
When an UnknownAsc type is encountered as a child, its resolved type is checked against the parent type.
When an UnknownDesc type is encountered as a parent, its resolved type is checked against the child type.
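
The four rules can be illustrated on a toy lattice (Nothing below Dog and Cat, both below Animal, below Universe). Everything here is an assumption made for illustration: Ty, Term, lub, glb and check_subtype are hypothetical names, not the PR's API. When both sides are unknown, this sketch arbitrarily refines the parent first.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum Ty { Nothing, Dog, Cat, Animal, Universe }

fn is_subtype(child: Ty, parent: Ty) -> bool {
    use Ty::*;
    child == parent
        || child == Nothing
        || parent == Universe
        || matches!((child, parent), (Dog, Animal) | (Cat, Animal))
}

/// Least upper bound; Dog/Cat is the only incomparable pair in this toy lattice.
fn lub(a: Ty, b: Ty) -> Ty {
    if is_subtype(a, b) { b } else if is_subtype(b, a) { a } else { Ty::Animal }
}

/// Greatest lower bound; incomparable types only share Nothing.
fn glb(a: Ty, b: Ty) -> Ty {
    if is_subtype(a, b) { a } else if is_subtype(b, a) { b } else { Ty::Nothing }
}

#[derive(Debug, Clone, Copy, PartialEq)]
enum Term {
    Known(Ty),
    UnknownAsc(Ty),  // current bound, starts at Nothing
    UnknownDesc(Ty), // current bound, starts at Universe
}

fn bound(term: Term) -> Ty {
    match term {
        Term::Known(ty) | Term::UnknownAsc(ty) | Term::UnknownDesc(ty) => ty,
    }
}

/// Checks `child <: parent`, returning Ok(changed) or an error message.
fn check_subtype(child: &mut Term, parent: &mut Term) -> Result<bool, String> {
    let (c, p) = (bound(*child), bound(*parent));
    // Ascending parent: raise it to the LUB of itself and the child.
    if let Term::UnknownAsc(current) = parent {
        let refined = lub(*current, c);
        let changed = refined != *current;
        *current = refined;
        return Ok(changed);
    }
    // Descending child: lower it to the GLB of itself and the parent.
    if let Term::UnknownDesc(current) = child {
        let refined = glb(*current, p);
        let changed = refined != *current;
        *current = refined;
        return Ok(changed);
    }
    // Otherwise only check the currently resolved types.
    if is_subtype(c, p) {
        Ok(false)
    } else {
        Err(format!("{c:?} is not a subtype of {p:?}"))
    }
}
```

For instance, an ascending unknown that receives Dog and then Cat as children ends up at Animal, and using it afterwards as a child of Dog is reported as an error rather than refined.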

This refinement process repeats until the system reaches a stable state where no more unknown types can be refined. At that point, if any constraints remain unsatisfied, the solver reports a type error (in the pseudocode above, the last one recorded during the final pass).

The key insight of this algorithm is that it makes monotonic progress: each refinement step either:

Successfully resolves a constraint
Refines an unknown type to be more specific
Identifies a type error

By tracking whether any types changed during each iteration and continuing until we reach a fixed point, we ensure all types are resolved as completely as possible before reporting any errors.

Limitations

While the algorithm is theoretically correct, it has the following limitations:

Its running time appears to be quadratic. This is not a problem for small programs, but it might become a compilation bottleneck in the future. Excellent heuristics exist for optimizing the order in which constraints get applied, which should help if it does.
As commented in the code, when resolving the constraint:

UnknownDesc <: UnknownAsc

We could either dump the left type to Nothing, or pump the right type to Universe. However, since pumping/dumping types cannot be undone later on (to avoid exponential run-time), it is possible that we over-dump/pump a specific type. A solution would be to ignore these constraints until all other constraints that can be safely applied have run out. This would result in much better empirical type inference.
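
One way to picture the proposed fix is to split the constraints into those that are safe to apply and those of the ambiguous UnknownDesc <: UnknownAsc shape, and to touch the second group only once the first has reached a fixed point. The sketch below shows only the split. Side, Constraint, is_ambiguous and split_deferred are hypothetical names, and a real implementation would likely have to re-evaluate ambiguity as types get refined.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum Side { Known, UnknownAsc, UnknownDesc }

/// Encodes `child <: parent`, keeping only the kind of each side.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Constraint {
    child: Side,
    parent: Side,
}

/// True for the `UnknownDesc <: UnknownAsc` shape, where either side could be refined.
fn is_ambiguous(constraint: &Constraint) -> bool {
    constraint.child == Side::UnknownDesc && constraint.parent == Side::UnknownAsc
}

/// Splits constraints into (safe, deferred), preserving their relative order.
fn split_deferred(constraints: &[Constraint]) -> (Vec<Constraint>, Vec<Constraint>) {
    constraints
        .iter()
        .copied()
        .partition(|constraint| !is_ambiguous(constraint))
}
```

The solver would then iterate over the safe group until nothing changes and only afterwards apply the deferred constraints.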

We postpone as future work (Finalize Type Inference #81):

Map concatenation, since map keys are contravariant. For example, let map: {Animal : I64} = {Dog : 3} ++ {Cat : 2} would fail under the current type checker.
List pattern matching is still broken.
Generic functions are not yet supported.

All of the above points could be solved by adding a dedicated constraint for each of them, as we did for field_access and call.
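
To see why the map example is rejected, here is a sketch assuming the usual contravariant rule for keys: {K1 : V1} <: {K2 : V2} holds when K2 <: K1 and V1 <: V2. The types and function names are hypothetical, and the actual failure in the type checker may arise at a different step (for example when merging the two key types).

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum Ty { Dog, Cat, Animal, I64 }

fn is_subtype(child: Ty, parent: Ty) -> bool {
    child == parent
        || matches!((child, parent), (Ty::Dog, Ty::Animal) | (Ty::Cat, Ty::Animal))
}

/// Map subtyping with contravariant keys and covariant values.
/// Each map type is given as a (key, value) pair.
fn map_is_subtype(child: (Ty, Ty), parent: (Ty, Ty)) -> bool {
    let (child_key, child_value) = child;
    let (parent_key, parent_value) = parent;
    // Note the flipped direction on keys.
    is_subtype(parent_key, child_key) && is_subtype(child_value, parent_value)
}
```

Under this rule {Dog : I64} is not a subtype of {Animal : I64}, because that would require Animal <: Dog, whereas {Animal : I64} is a subtype of {Dog : I64}.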

Some error messages are still confusing, although a lot of effort has already gone into improving them (e.g. see examples above). There is no silver bullet here: improving the span handling in the parser is probably the correct way forward.

Testing

Given how hard (and bloated!) it is to test each point in isolation, we simply test whether fully written-out programs pass the type checker or not in solver.rs.

Error reporting has been tested manually for a variety of programs, and is expected to improve as we start writing rules.

Future work

Focus will be put on the final HIR compilation process, which needs to correctly encode the identified Logical / Physical types and reject ambiguous (albeit correctly inferred) types, i.e. Nothing or Universe. It is indeed better practice to enforce type annotations in these scenarios.
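
A possible shape for that rejection step, as a sketch only: Ty and require_concrete are hypothetical names, the error wording is made up, and nested types (e.g. a list of Nothing) are not handled here.

```rust
#[derive(Debug, Clone, PartialEq)]
enum Ty { Nothing, Universe, Logical, Physical, I64 }

/// Rejects bindings whose inferred type is still the bottom or top of the lattice,
/// since such a type carries no usable information for compilation.
fn require_concrete(name: &str, ty: &Ty) -> Result<(), String> {
    match ty {
        Ty::Nothing | Ty::Universe => Err(format!(
            "cannot infer a concrete type for `{name}` (got {ty:?}); please add a type annotation"
        )),
        _ => Ok(()),
    }
}
```

A binding that resolves to Logical, Physical or I64 passes, while one left at Nothing or Universe is reported with a request for an annotation.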

Note to Reviewers

Don't read the diff, just read the entire analyzer/types directory.

codecov-commenter · 2025-04-24T16:18:40Z

Codecov Report

Attention: Patch coverage is 88.79593% with 308 lines in your changes missing coverage. Please review.

Project coverage is 88.3%. Comparing base (70dbd27) to head (6c6659a).

Files with missing lines	Patch %	Lines
optd/src/dsl/analyzer/errors.rs	8.9%	112 Missing ⚠️
optd/src/dsl/analyzer/types/registry.rs	55.9%	63 Missing ⚠️
optd/src/dsl/analyzer/types/solver.rs	88.0%	53 Missing ⚠️
optd/src/dsl/analyzer/types/generate.rs	81.4%	29 Missing ⚠️
optd/src/dsl/analyzer/types/glb.rs	95.0%	20 Missing ⚠️
optd/src/dsl/analyzer/types/lub.rs	96.9%	14 Missing ⚠️
optd/src/dsl/analyzer/types/subtype.rs	98.3%	13 Missing ⚠️
optd/src/dsl/analyzer/from_ast/expr.rs	89.6%	3 Missing ⚠️
optd/src/dsl/compile.rs	90.9%	1 Missing ⚠️

Additional details and impacted files

Files with missing lines	Coverage Δ
optd/src/dsl/analyzer/context.rs	`89.8% <100.0%> (+0.7%)`	⬆️
optd/src/dsl/analyzer/from_ast/converter.rs	`97.5% <100.0%> (ø)`
optd/src/dsl/analyzer/from_ast/pattern.rs	`94.5% <100.0%> (+0.2%)`	⬆️
optd/src/dsl/analyzer/from_ast/types.rs	`97.6% <100.0%> (+1.5%)`	⬆️
optd/src/dsl/analyzer/hir.rs	`77.5% <ø> (ø)`
optd/src/dsl/analyzer/semantic_checks/adt_check.rs	`98.2% <100.0%> (ø)`
optd/src/dsl/parser/expr.rs	`81.4% <100.0%> (+0.1%)`	⬆️
optd/src/dsl/utils/span.rs	`78.9% <100.0%> (+7.5%)`	⬆️
optd/src/dsl/compile.rs	`65.4% <90.9%> (+65.4%)`	⬆️
optd/src/dsl/analyzer/from_ast/expr.rs	`94.5% <89.6%> (+<0.1%)`	⬆️
... and 7 more

... and 1 file with indirect coverage changes


optd/src/dsl/analyzer/types/registry.rs

connortsui20

I don't have time to go through the whole algorithm, so I'm just doing a quick look over some of the code and tests; everything looks fine. I have other concerns about Rust-related things, but we have more important things to worry about.

AlSchlo and others added 25 commits April 20, 2025 00:35

Add span in type and make merge with universe an error

d3fab8b

clippy and fmt

41b01d7

Better type span error reporting

22c1dee

Fix postfix span

c96008d

Fix binary span bug

bb76ccd

Add descending types

1aa4da1

Fix glb+lub bug

2ba9bd3

Fix map constraints

0848e65

Fix list pattern inference

d16198c

Add call constraint without generics

ddf1753

clippy

b67a780

Correct call constraint generation

0715780

nits

396bf54

clippy

41caebd

fix bug in solver

3a34a39

Fix display with >=

ac5e698

Add back descending

7de2f4f

Add extra example file

b292ead

Fix annotation bug

4b8fc0a

nits

9419afe

simplify return type

55b81a5

Fix closure contraints return type and add #80 description

3aaf6c9

Fix list pattern matching

956112e

Fix & refactor type inference

4569e37

Final fix of type inference before testing

039e2f6

AlSchlo marked this pull request as ready for review April 24, 2025 00:08

AlSchlo changed the title ~~Alexis/type infer 3~~ (feat) Type Inference Apr 24, 2025

AlSchlo added 2 commits April 24, 2025 11:46

Add type checker tests

0365f29

merge

c761ef8

AlSchlo mentioned this pull request Apr 24, 2025

Finalize Type Inference #81

Open

Merge branch 'main' into alexis/type-infer-3

6c6659a

AlSchlo requested review from connortsui20 and SarveshOO7 and removed request for connortsui20 April 24, 2025 19:47

AlSchlo self-assigned this Apr 24, 2025

connortsui20 reviewed Apr 24, 2025

optd/src/dsl/analyzer/types/registry.rs

connortsui20 approved these changes Apr 24, 2025


AlSchlo merged commit d4898a8 into main Apr 24, 2025
12 checks passed

AlSchlo deleted the alexis/type-infer-3 branch April 24, 2025 23:11
