Merged
Changes from 27 commits
38 commits
1082625
analyze: initial commit
spernsteiner Mar 14, 2022
03c338f
analyze: run polonius and dump outputs
spernsteiner Mar 15, 2022
7566581
analyze: generate var_defined/used_at facts
spernsteiner Mar 15, 2022
9c8f495
analyze: generate path_assigned/moved/accessed_at facts
spernsteiner Mar 15, 2022
66a550b
analyze: generate use_of_var_derefs_region facts
spernsteiner Mar 17, 2022
0ba67e3
analyze: generate load_issued/invalidated_at facts
spernsteiner Mar 17, 2022
c2d619a
analyze: generate subset_base facts
spernsteiner Mar 21, 2022
304b1c9
analyze: refactor: move polonius stuff into new borrowck module
spernsteiner Mar 28, 2022
e86eff8
analyze: run polonius on a hypothetical rewrite
spernsteiner Apr 7, 2022
9500bdd
analyze: remove UNIQUE permission on error and propagate
spernsteiner Apr 14, 2022
661188e
analyze: handle reborrows when updating UNIQUE
spernsteiner Apr 18, 2022
d75e429
analyze: infer READ and WRITE permissions
spernsteiner May 2, 2022
b76966f
analyze: propagate permissions through offset() calls
spernsteiner May 2, 2022
f7b329c
analyze: require OFFSET_ADD and OFFSET_SUB for offset() args
spernsteiner May 2, 2022
de2dcee
analyze: handle constructs seen in insertion_sort
spernsteiner May 2, 2022
b8d0dac
analyze: include variable names and source code in "final labeling" o…
spernsteiner May 5, 2022
f30b994
analyze: adjust "final labeling" output for long expressions
spernsteiner May 6, 2022
168c24d
analyze: convert examples to a proper test suite
spernsteiner May 6, 2022
b17a294
analyze: add test for p.offset(..).offset(..)
spernsteiner May 6, 2022
6e8ca3f
analyze: refactor dataflow update tracking
spernsteiner May 26, 2022
35d8926
analyze: refactor dataflow propagation to allow pluggable update rules
spernsteiner May 26, 2022
f475677
analyze: determine which pointers should use &Cell<T>
spernsteiner May 26, 2022
c455f36
analyze: compute new types based on pointer permissions
spernsteiner Jun 14, 2022
56e9b86
analyze: handle Deref and Field projections in Context::type_of
spernsteiner Jun 30, 2022
2def6de
analyze: generate and print expr rewrites
spernsteiner Jun 30, 2022
f212c9a
analyze: fix filecheck test runner failing to find c2rust-analyze binary
spernsteiner Jul 12, 2022
e9c7f65
analyze: use shared rust-toolchain
spernsteiner Jul 12, 2022
dbe3e7b
analyze: remove /target/ from .gitignore
spernsteiner Jul 18, 2022
72cbf53
analyze: add c2rust-analyze to workspace
spernsteiner Jul 18, 2022
6356224
analyze: add rust-analyzer section to Cargo.toml
spernsteiner Jul 18, 2022
722c0c0
Add `*.rlib` to the `c2rust-analyze` `.gitignore` as it's produced by…
kkysen Jul 18, 2022
3dcade1
analyze: fix warnings
spernsteiner Jul 19, 2022
0fa9ab3
analyze: fix warnings in tests
spernsteiner Jul 19, 2022
5f70d42
analyze: use rustc-private-link to find rustc library paths
spernsteiner Jul 19, 2022
6218531
analyze: update README
spernsteiner Jul 19, 2022
0a0ab79
analyze: change cfg for extra debugging code in tests/filecheck/*.rs
spernsteiner Jul 20, 2022
d3a0dc2
analyze: autodetect path to FileCheck in tests
spernsteiner Jul 21, 2022
b5a15ce
analyze: cargo fmt
spernsteiner Jul 28, 2022
2 changes: 2 additions & 0 deletions c2rust-analyze/.gitignore
@@ -0,0 +1,2 @@
/target/
/inspect/
13 changes: 13 additions & 0 deletions c2rust-analyze/Cargo.toml
@@ -0,0 +1,13 @@
[package]
name = "c2rust-analyze"
version = "0.1.0"
edition = "2021"

# See more keys and their definitions at https://doc.rust-lang.org/cargo/reference/manifest.html

[dependencies]
polonius-engine = "0.13.0"
rustc-hash = "1.1.0"
bitflags = "1.3.2"

[workspace]
11 changes: 11 additions & 0 deletions c2rust-analyze/README.md
@@ -0,0 +1,11 @@
```sh
rustc --crate-type rlib instrument_support.rs
cargo run -- fib.rs -L ~/.rustup/toolchains/nightly-2022-02-14-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-gnu/lib/
Contributor

See rustc_private_link::SysRoot::link_rustc_private for how we automated this -L ~/.rustup/toolchains/nightly-2022-02-14-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-gnu/lib/ part in build.rs. It might let us avoid having to specify this here.

Collaborator Author

I updated the readme (which was horribly out of date) but left the big -L path in there. I'm not sure what's the right solution for setting the default library paths (e.g. do we care about cross-compiling?), but dynamic_instrumentation has the same issue, and I figure whenever it becomes a real problem, we can come up with a general solution and apply it to both at once.

Contributor

Right now we're just adding ~/.rustup/toolchains/nightly-2022-02-14-x86_64-unknown-linux-gnu/lib to rpath, and that seems to work (and get rid of needing -L or setting LD_LIBRARY_PATH). Cross-compiling is trickier, though. I think we should ignore cross-compiling for now, and when we get to it, cross-compile only as a cargo subcommand, which should set the right LD_LIBRARY_PATH for us.

Collaborator Author

I think what you're describing is a different situation. There are three different times when we care about the sysroot / lib / rustlib dirs:

  1. When building c2rust-analyze / c2rust-instrument, the linker needs the rustlib path in order to link against the various rustc_* libraries
  2. When running c2rust-analyze / c2rust-instrument, the dynamic loader needs the lib path in order to find librustc_driver.so
  3. When compiling another program using c2rust-analyze / c2rust-instrument, the tool needs the rustlib path in order to find libcore/libstd, which are dependencies of the program being compiled.

build.rs calling rustc_private_link solves (1); setting the rpath solves (2); this flag in the readme covers (3).

Contributor

Ah, that's what it's being used for. Doesn't cargo find libstd by itself normally?

Collaborator Author

rustc can find libstd on its own if everything is installed normally. I haven't checked, but I would guess that cargo (outside -Z build-std mode) just lets rustc find libstd as it usually does. In this case, we don't have working cargo integration for c2rust-analyze at the moment, and our rustc wrapper (the c2rust-analyze binary) is not installed in the usual way.

Contributor

For working cargo integration, which I think we definitely want, see #554 for a better way to do it. We may be able to share some of that code.

./fib
```

The final `./fib` command should print several lines like
```
[CALL] fib(5,)
```
which come from the injected calls to `instrument_support::handle_call`.
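
The review thread above separates three moments at which the toolchain directories matter: building the tool, running it, and using it to compile another program. As a rough sketch of how a build script could address each one, the block below formats one Cargo directive per case. The `cargo_directives` helper is hypothetical and not part of this PR, and the pairing of directive to case is an assumption drawn from the discussion rather than from the actual build script.

```rust
use std::path::Path;

/// Hypothetical helper (not from the PR): one build-script directive per case
/// described in the review discussion.
pub fn cargo_directives(sysroot: &Path, target: &str) -> Vec<String> {
    let lib = sysroot.join("lib");
    let target_lib = lib.join("rustlib").join(target).join("lib");
    vec![
        // (1) Build time: let the linker search the rustlib directory for the
        // rustc_* libraries.
        format!("cargo:rustc-link-search={}", target_lib.display()),
        // (2) Run time: embed an rpath so the dynamic loader can find
        // librustc_driver in the toolchain lib directory.
        format!("cargo:rustc-link-arg=-Wl,-rpath,{}", lib.display()),
        // (3) Compiling other programs: record the libcore/libstd directory at
        // compile time so the tool can later pass it to rustc as `-L`.
        format!("cargo:rustc-env=C2RUST_TARGET_LIB_DIR={}", target_lib.display()),
    ]
}
```

A real `build.rs` would print each returned string on its own line; returning them keeps the sketch easy to check in isolation.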
21 changes: 21 additions & 0 deletions c2rust-analyze/build.rs
@@ -0,0 +1,21 @@
use std::env;
use std::path::Path;
use std::process::Command;
use std::str;

fn main() {
    let rustc = env::var("RUSTC").unwrap();
    // Add the toolchain lib/ directory to `-L`. This fixes the linker error "cannot find
    // -lLLVM-13-rust-1.60.0-nightly".
    let out = Command::new(&rustc)
        .args(&["--print", "sysroot"])
        .output().unwrap();
    assert!(out.status.success());
    let sysroot = Path::new(str::from_utf8(&out.stdout).unwrap().trim_end());
    let lib_dir = sysroot.join("lib");
    println!("cargo:rustc-link-search={}", lib_dir.display());

    let target = env::var("HOST").unwrap();
    let target_lib_dir = lib_dir.join("rustlib").join(target).join("lib");
    println!("cargo:rustc-env=C2RUST_TARGET_LIB_DIR={}", target_lib_dir.display());
}
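
The build script exports `C2RUST_TARGET_LIB_DIR` through `cargo:rustc-env`, which makes the value visible to the crate at compile time. How `c2rust-analyze` consumes it is not shown in this excerpt, so the following is only a minimal sketch of one possibility: a rustc wrapper that appends the directory as a default `-L` argument. Both `with_default_lib_dir` and `baked_in_lib_dir` are hypothetical names.

```rust
/// Hypothetical helper: append `-L <dir>` to the rustc arguments unless the
/// caller already passed exactly that pair.
pub fn with_default_lib_dir(mut args: Vec<String>, lib_dir: Option<&str>) -> Vec<String> {
    if let Some(dir) = lib_dir {
        let already_present = args.windows(2).any(|w| w[0] == "-L" && w[1] == dir);
        if !already_present {
            args.push("-L".to_string());
            args.push(dir.to_string());
        }
    }
    args
}

/// `option_env!` is expanded at compile time. It yields `None` when the
/// variable was not set for this compilation (for example, outside Cargo).
pub fn baked_in_lib_dir() -> Option<&'static str> {
    option_env!("C2RUST_TARGET_LIB_DIR")
}
```

With something like this in place, `with_default_lib_dir(std::env::args().skip(1).collect(), baked_in_lib_dir())` would supply the library path that the README currently asks users to type by hand.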
139 changes: 139 additions & 0 deletions c2rust-analyze/rename_nll_facts.py
@@ -0,0 +1,139 @@
'''
Usage: `python3 rename_nll_facts.py src ref dest`

Renames atoms in `src/*.facts` to match the names used in `ref/*.facts`, then
writes the renamed facts to `dest/`.
'''

import ast
from collections import defaultdict
import os
import sys

src_dir, ref_dir, dest_dir = sys.argv[1:]

# Map `src` loan/origin/path names to `ref` loan/origin/path names. We don't
# break this down by type because the names for each type don't collide anyway.
name_map = {}
# Set of `ref` names that appear as values in `name_map`.
ref_names_seen = set()

def match_name(src_name, ref_name):
    if src_name in name_map:
        old_ref_name = name_map[src_name]
        if ref_name != old_ref_name:
            print('error: %r matches both %r and %r' % (
                src_name, old_ref_name, ref_name))
            return
    else:
        if ref_name in ref_names_seen:
            print('error: %r matches %r, but %r is already used' % (
                src_name, ref_name, ref_name))
            return
        name_map[src_name] = ref_name
        ref_names_seen.add(ref_name)

def match_loan(src_name, ref_name):
    match_name(src_name, ref_name)

def match_origin(src_name, ref_name):
    match_name(src_name, ref_name)

def match_path(src_name, ref_name):
    match_name(src_name, ref_name)


def load(name):
    with open(os.path.join(src_dir, name + '.facts')) as f:
        src_rows = [[ast.literal_eval(s) for s in line.strip().split('\t')]
                    for line in f]
    with open(os.path.join(ref_dir, name + '.facts')) as f:
        ref_rows = [[ast.literal_eval(s) for s in line.strip().split('\t')]
                    for line in f]
    return src_rows, ref_rows


# Match up paths using `path_is_var` and `path_assigned_at_base`.

def match_path_is_var():
    src, ref = load('path_is_var')
    ref_dct = {var: path for path, var in ref}
    for path, var in src:
        if var not in ref_dct:
            continue
        match_path(path, ref_dct[var])

match_path_is_var()

def match_path_assigned_at_base():
    src, ref = load('path_assigned_at_base')
    ref_dct = {point: path for path, point in ref}
    for path, point in src:
        if point not in ref_dct:
            continue
        match_path(path, ref_dct[point])

match_path_assigned_at_base()

# Match up origins and loans using `loan_issued_at`

def match_loan_issued_at():
    src, ref = load('loan_issued_at')
    ref_dct = {point: (origin, loan) for origin, loan, point in ref}
    for origin, loan, point in src:
        if point not in ref_dct:
            continue
        match_origin(origin, ref_dct[point][0])
        match_loan(loan, ref_dct[point][1])

match_loan_issued_at()

# Match up origins using `use_of_var_derefs_origin`

def match_use_of_var_derefs_origin():
    src, ref = load('use_of_var_derefs_origin')
    src_dct = defaultdict(list)
    for var, origin in src:
        src_dct[var].append(origin)
    ref_dct = defaultdict(list)
    for var, origin in ref:
        ref_dct[var].append(origin)
    for var in set(src_dct.keys()) & set(ref_dct.keys()):
        src_origins = src_dct[var]
        ref_origins = ref_dct[var]
        if len(src_origins) != len(ref_origins):
            print('error: var %r has %d origins in src but %d in ref' % (
                var, len(src_origins), len(ref_origins)))
            continue
        for src_origin, ref_origin in zip(src_origins, ref_origins):
            match_origin(src_origin, ref_origin)

match_use_of_var_derefs_origin()


# Rewrite `src` using the collected name mappings.

os.makedirs(dest_dir, exist_ok=True)
for name in os.listdir(src_dir):
    if name.startswith('.') or not name.endswith('.facts'):
        continue

    with open(os.path.join(src_dir, name)) as src, \
            open(os.path.join(dest_dir, name), 'w') as dest:
        for line in src:
            src_parts = [ast.literal_eval(s) for s in line.strip().split('\t')]
            dest_parts = []
            for part in src_parts:
                if part.startswith('_') or part.startswith('Start') or part.startswith('Mid'):
                    dest_parts.append(part)
                    continue

                dest_part = name_map.get(part)
                if dest_part is None:
                    print('error: no mapping for %r (used in %s: %r)' % (
                        part, name, src_parts))
                    dest_part = 'OLD:' + part
                dest_parts.append(dest_part)

            dest.write('\t'.join('"%s"' % part for part in dest_parts) + '\n')
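
The script's `match_name` enforces a one-to-one mapping: a `src` name may map to only one `ref` name, and a `ref` name may be claimed by only one `src` name. For readers following the Rust side of this PR, here is a rough sketch of the same rule in Rust. The `NameMatcher` type is hypothetical, and unlike the script it returns an error value instead of printing a message and continuing.

```rust
use std::collections::{HashMap, HashSet};

/// Hypothetical Rust rendering of the script's `name_map` / `ref_names_seen` pair.
#[derive(Default)]
pub struct NameMatcher {
    name_map: HashMap<String, String>,
    ref_names_seen: HashSet<String>,
}

impl NameMatcher {
    /// Record that `src` corresponds to `reference`, rejecting any pairing
    /// that would make the mapping many-to-one or one-to-many.
    pub fn match_name(&mut self, src: &str, reference: &str) -> Result<(), String> {
        if let Some(old) = self.name_map.get(src) {
            if old.as_str() != reference {
                return Err(format!(
                    "{:?} matches both {:?} and {:?}",
                    src, old, reference
                ));
            }
            // Same pairing seen again: nothing to do.
            return Ok(());
        }
        if self.ref_names_seen.contains(reference) {
            return Err(format!(
                "{:?} matches {:?}, but {:?} is already used",
                src, reference, reference
            ));
        }
        self.name_map.insert(src.to_string(), reference.to_string());
        self.ref_names_seen.insert(reference.to_string());
        Ok(())
    }

    /// Look up the `ref` name recorded for `src`, if any.
    pub fn lookup(&self, src: &str) -> Option<&str> {
        self.name_map.get(src).map(String::as_str)
    }
}
```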

1 change: 1 addition & 0 deletions c2rust-analyze/rust-toolchain
198 changes: 198 additions & 0 deletions c2rust-analyze/src/borrowck/atoms.rs
@@ -0,0 +1,198 @@
use std::collections::hash_map::{HashMap, Entry};
use std::hash::Hash;
use polonius_engine::{self, Atom, FactTypes};
use rustc_middle::mir::{BasicBlock, Local, Location, Place, PlaceElem};
use rustc_middle::ty::TyCtxt;

macro_rules! define_atom_type {
    ($Atom:ident) => {
        #[derive(Clone, Copy, PartialEq, Eq, PartialOrd, Ord, Debug, Hash)]
        pub struct $Atom(usize);

        impl From<usize> for $Atom {
            fn from(x: usize) -> $Atom {
                $Atom(x)
            }
        }

        impl From<$Atom> for usize {
            fn from(x: $Atom) -> usize {
                x.0
            }
        }

        impl Atom for $Atom {
            fn index(self) -> usize { self.0 }
        }
    };
}

define_atom_type!(Origin);
define_atom_type!(Loan);
define_atom_type!(Point);
define_atom_type!(Variable);
define_atom_type!(Path);
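
The macro above stamps out one index newtype per kind of atom, so an `Origin` and a `Loan` with the same numeric index remain distinct types. A minimal, self-contained sketch of that effect follows. The local `Atom` trait is a simplified stand-in for `polonius_engine::Atom` so the block compiles without the crate, and `loan_index` is a hypothetical helper added for illustration.

```rust
/// Simplified local stand-in for `polonius_engine::Atom` (an assumption made
/// so this sketch has no external dependencies).
pub trait Atom: From<usize> + Into<usize> + Copy {
    fn index(self) -> usize;
}

macro_rules! define_atom_type {
    ($Atom:ident) => {
        #[derive(Clone, Copy, PartialEq, Eq, PartialOrd, Ord, Debug, Hash)]
        pub struct $Atom(usize);

        impl From<usize> for $Atom {
            fn from(x: usize) -> $Atom { $Atom(x) }
        }

        impl From<$Atom> for usize {
            fn from(x: $Atom) -> usize { x.0 }
        }

        impl Atom for $Atom {
            fn index(self) -> usize { self.0 }
        }
    };
}

define_atom_type!(Origin);
define_atom_type!(Loan);

/// Hypothetical helper: accepts only loans. Passing an `Origin` here is a
/// compile-time type error even if it wraps the same index.
pub fn loan_index(loan: Loan) -> usize {
    loan.index()
}
```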


#[derive(Clone, Copy, Debug, Default)]
pub struct AnalysisFactTypes;
impl FactTypes for AnalysisFactTypes {
    type Origin = Origin;
    type Loan = Loan;
    type Point = Point;
    type Variable = Variable;
    type Path = Path;
}

pub type AllFacts = polonius_engine::AllFacts<AnalysisFactTypes>;
pub type Output = polonius_engine::Output<AnalysisFactTypes>;



#[derive(Clone, Debug)]
struct AtomMap<T, A> {
    atom_to_thing: Vec<T>,
    thing_to_atom: HashMap<T, A>,
}

impl<T, A> Default for AtomMap<T, A> {
    fn default() -> AtomMap<T, A> {
        AtomMap {
            atom_to_thing: Vec::new(),
            thing_to_atom: HashMap::new(),
        }
    }
}

impl<T: Hash + Eq + Clone, A: Atom> AtomMap<T, A> {
    pub fn new() -> AtomMap<T, A> {
        AtomMap {
            atom_to_thing: Vec::new(),
            thing_to_atom: HashMap::new(),
        }
    }

    pub fn add(&mut self, x: T) -> A {
        match self.thing_to_atom.entry(x.clone()) {
            Entry::Occupied(e) => {
                *e.get()
            },
            Entry::Vacant(e) => {
                let atom = A::from(self.atom_to_thing.len());
                self.atom_to_thing.push(x);
                e.insert(atom);
                atom
            },
        }
    }

    pub fn add_new(&mut self, x: T) -> (A, bool) {
        match self.thing_to_atom.entry(x.clone()) {
            Entry::Occupied(e) => {
                (*e.get(), false)
            },
            Entry::Vacant(e) => {
                let atom = A::from(self.atom_to_thing.len());
                self.atom_to_thing.push(x);
                e.insert(atom);
                (atom, true)
            },
        }
    }

    pub fn get(&self, x: A) -> T {
        self.atom_to_thing[x.into()].clone()
    }
}
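
`AtomMap` is an interner: each distinct value gets the next free index the first time it is added, and the same index on every later lookup. The sketch below shows that behaviour in isolation. It is a simplification under stated assumptions: atoms are plain `usize` values instead of a typed `Atom`, and the `Interner` name is hypothetical.

```rust
use std::collections::HashMap;
use std::hash::Hash;

/// Simplified interner in the style of `AtomMap`, with `usize` atoms.
pub struct Interner<T> {
    atom_to_thing: Vec<T>,
    thing_to_atom: HashMap<T, usize>,
}

impl<T: Hash + Eq + Clone> Interner<T> {
    pub fn new() -> Self {
        Interner {
            atom_to_thing: Vec::new(),
            thing_to_atom: HashMap::new(),
        }
    }

    /// Return the atom for `x` and whether it was created by this call.
    pub fn add_new(&mut self, x: T) -> (usize, bool) {
        if let Some(&atom) = self.thing_to_atom.get(&x) {
            return (atom, false);
        }
        // The next atom is the current length, so atoms are dense indices
        // into `atom_to_thing`.
        let atom = self.atom_to_thing.len();
        self.atom_to_thing.push(x.clone());
        self.thing_to_atom.insert(x, atom);
        (atom, true)
    }

    /// Map an atom back to the value it was created for.
    pub fn get(&self, atom: usize) -> &T {
        &self.atom_to_thing[atom]
    }
}
```

The `bool` returned by `add_new` is what lets a caller emit facts exactly once per newly interned value.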


#[derive(Clone, Copy, PartialEq, Eq, Debug, Hash)]
pub enum SubPoint {
    Start,
    Mid,
}


#[derive(Clone, Debug, Default)]
pub struct AtomMaps<'tcx> {
    next_origin: usize,
    next_loan: usize,
    point: AtomMap<(BasicBlock, usize, SubPoint), Point>,
    path: AtomMap<(Local, &'tcx [PlaceElem<'tcx>]), Path>,
}

impl<'tcx> AtomMaps<'tcx> {
    pub fn origin(&mut self) -> Origin {
        let idx = self.next_origin;
        self.next_origin += 1;
        Origin(idx)
    }

    pub fn loan(&mut self) -> Loan {
        let idx = self.next_loan;
        self.next_loan += 1;
        Loan(idx)
    }

    pub fn point(&mut self, bb: BasicBlock, idx: usize, sub: SubPoint) -> Point {
        self.point.add((bb, idx, sub))
    }

    pub fn point_mid_location(&mut self, loc: Location) -> Point {
        self.point(loc.block, loc.statement_index, SubPoint::Mid)
    }

    pub fn get_point(&self, x: Point) -> (BasicBlock, usize, SubPoint) {
        self.point.get(x)
    }

    pub fn get_point_location(&self, x: Point) -> Location {
        let (block, statement_index, _) = self.get_point(x);
        Location { block, statement_index }
    }

    pub fn variable(&mut self, l: Local) -> Variable {
        Variable(l.as_usize())
    }

    pub fn get_variable(&self, x: Variable) -> Local {
        Local::from_usize(x.0)
    }

    pub fn path(&mut self, facts: &mut AllFacts, place: Place<'tcx>) -> Path {
        self.path_slice(facts, place.local, place.projection)
    }

    fn path_slice(
        &mut self,
        facts: &mut AllFacts,
        local: Local,
        projection: &'tcx [PlaceElem<'tcx>],
    ) -> Path {
        let (path, new) = self.path.add_new((local, projection));
        if new {
            if projection.len() == 0 {
                let var = self.variable(local);
                facts.path_is_var.push((path, var));
            } else {
                let parent = self.path_slice(facts, local, &projection[.. projection.len() - 1]);
                // TODO: check ordering of arguments here
                facts.child_path.push((parent, path));
            }
        }
        path
    }

    pub fn get_path(&self, tcx: TyCtxt<'tcx>, x: Path) -> Place<'tcx> {
        let (local, projection) = self.path.get(x);
        let projection = tcx.intern_place_elems(projection);
        Place { local, projection }
    }

    pub fn get_path_projection(&self, tcx: TyCtxt<'tcx>, x: Path) -> &'tcx [PlaceElem<'tcx>] {
        let (local, projection) = self.path.get(x);
        projection
    }
}
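
`path_slice` registers a place and, the first time it sees one, recursively registers every shorter prefix of its projection, so that each new path emits exactly one `path_is_var` or `child_path` fact. The sketch below reproduces that recursion with stand-in types: locals are `usize`, projection elements are field-name strings, and the `Paths` struct is hypothetical. The tuple order pushed into `child_path` simply mirrors the source lines above, which carry a TODO about argument ordering; the sketch does not check that order against polonius.

```rust
use std::collections::HashMap;

/// Stand-in for the path portion of `AtomMaps`, using simple types.
#[derive(Default)]
pub struct Paths {
    ids: HashMap<(usize, Vec<&'static str>), usize>,
    /// `(path, local)` pairs, emitted for paths with an empty projection.
    pub path_is_var: Vec<(usize, usize)>,
    /// `(parent, path)` pairs, in the same order the source pushes them.
    pub child_path: Vec<(usize, usize)>,
}

impl Paths {
    /// Intern `local.projection`, emitting facts only on first sight.
    pub fn path(&mut self, local: usize, projection: &[&'static str]) -> usize {
        let key = (local, projection.to_vec());
        if let Some(&id) = self.ids.get(&key) {
            return id;
        }
        // Assign the id before recursing, as the source does via `add_new`.
        let id = self.ids.len();
        self.ids.insert(key, id);
        if projection.is_empty() {
            self.path_is_var.push((id, local));
        } else {
            // Register the parent by dropping the last projection element.
            let parent = self.path(local, &projection[..projection.len() - 1]);
            self.child_path.push((parent, id));
        }
        id
    }
}
```

Interning `x.y.z` for local 1 therefore creates three paths in one call (`x.y.z`, then `x.y`, then `x`), and a later request for `x.y` returns the existing id without adding facts.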

