
Elegant and Scalable Code Querying with Code Property Graphs
A Talk by Fabian Yamaguchi (Chief Scientist, Managing Director, ShiftLeft GmbH)
About this Talk
Programming is an unforgiving art form in which even minor flaws can cause rockets to explode, data to be stolen, and systems to be compromised. Today, a system tasked with automatically identifying these flaws not only faces the intrinsic difficulties and theoretical limits of the task itself; it must also cope with the many different forms in which programs can be written and with the awe-inspiring speed at which developers push new code into CI/CD pipelines. So much code, so little time.
The code property graph, a multi-layered graph representation that captures properties of code across different abstractions (application code, libraries, and frameworks), has been developed over the last six years to provide a foundation for the challenging problem of identifying flaws in program code at scale, whether that code is high-level, dynamically typed JavaScript, statically typed Scala in its bytecode form, the syntax trees generated by the Roslyn C# compiler, or the bitcode that flows through LLVM.
On top of this graph, we define a common query language, grounded in a formal code property graph specification, that lets us analyze code elegantly regardless of the source language. Paired with a state-of-the-art data flow tracker that is itself formulated over code property graphs, this yields powerful, distributed, cloud-native code analysis. This talk provides an introduction to the technology.
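
To make the idea of querying code as a graph more concrete, here is a minimal Python sketch. It is not the ShiftLeft query language or the code property graph specification: the ToyGraph class, the DATA_FLOW and AST edge labels, the node properties, and the tainted_calls helper are all hypothetical names chosen for illustration. It models a few program elements as nodes with properties, connects them with labeled edges, and asks which calls to a sink function are reachable from a source call along data-flow edges.

```python
from collections import defaultdict


class ToyGraph:
    """A tiny labeled property graph (illustrative only, not a real CPG)."""

    def __init__(self):
        self.nodes = {}                 # node id -> property dict
        self.edges = defaultdict(list)  # (source id, edge label) -> [target ids]

    def add_node(self, node_id, **props):
        self.nodes[node_id] = props

    def add_edge(self, src, label, dst):
        self.edges[(src, label)].append(dst)

    def find(self, **props):
        """Return ids of nodes whose properties match all given key/value pairs."""
        return [n for n, p in self.nodes.items()
                if all(p.get(k) == v for k, v in props.items())]

    def reachable(self, start, label):
        """Return all nodes reachable from start by following edges with this label."""
        seen, stack = set(), [start]
        while stack:
            node = stack.pop()
            for nxt in self.edges.get((node, label), []):
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        return seen


def tainted_calls(graph, source_name, sink_name):
    """Pair each source call with every sink call it reaches via DATA_FLOW edges."""
    sinks = set(graph.find(kind="CALL", name=sink_name))
    hits = []
    for src in graph.find(kind="CALL", name=source_name):
        for sink in sorted(sinks & graph.reachable(src, "DATA_FLOW")):
            hits.append((src, sink))
    return hits


# Hand-built graph for a hypothetical snippet:
#   buf = read_input(); copy(dst, buf); copy(dst, "constant")
g = ToyGraph()
g.add_node(1, kind="CALL", name="read_input")
g.add_node(2, kind="IDENTIFIER", name="buf")
g.add_node(3, kind="CALL", name="copy")
g.add_node(4, kind="CALL", name="copy")   # called with a constant, not with buf
g.add_edge(3, "AST", 2)                   # buf is an argument of the first copy call
g.add_edge(1, "DATA_FLOW", 2)             # the result of read_input flows into buf
g.add_edge(2, "DATA_FLOW", 3)             # buf flows into the first copy call

flows = tainted_calls(g, "read_input", "copy")
print(flows)
```

Because the query only refers to node properties and edge labels, the same function would work on a graph produced from any source language, which is the property the abstract emphasizes. A real data flow tracker would of course derive the data-flow edges from the code rather than have them added by hand.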