README.rst (8 additions & 8 deletions)
--- a/README.rst
+++ b/README.rst
@@ -15,7 +15,7 @@
 Security auditing and static code analysis
 =================================================
 
-Aura is a static analysis framework developed as a response to ever increasing threat of malicious packages and vulnerable code published on PyPI.
+Aura is a static analysis framework developed as a response to the ever-increasing threat of malicious packages and vulnerable code published on PyPI.
 
 
 Project goals:
@@ -28,20 +28,20 @@ Project goals:
 Why Aura?
 ---------
 
-While there are other tools with a functionality that overlaps with Aura such as Bandit, dlint, semgrep etc. the focus of these alternatives is different which impacts functionality and how they are being used. These alternatives are mainly intended to be used in a similar way to linters, integrated into IDEs, frequently run during the development which makes it important to **minimize false positives** and reporting with clear **actionable** explanations in ideal cases.
+While there are other tools with functionality that overlaps with Aura such as Bandit, dlint, semgrep etc. the focus of these alternatives is different which impacts the functionality and how they are being used. These alternatives are mainly intended to be used in a similar way to linters, integrated into IDEs, frequently run during the development which makes it important to **minimize false positives** and reporting with clear **actionable** explanations in ideal cases.
 
-Aura on the other hand reports on **behaviour of the code**, **anomalies** and **vulnerabilities** with as much information as possible at the cost of false positive. There are a lot of things reported by aura that are not necessarily actionable by a user but they tell you a lot about the behaviour of the code such as doing network communication, accessing sensitive files or using mechanisms associated with obfuscation indicating a possible malicious code. By collecting this kind of data and aggregating it together, Aura can be compared in functionality to other security systems such as antivirus, IDS or firewalls that are essentially doing the same analysis but on a different kind of data (network communication, running processes etc).
+Aura on the other hand reports on ** behavior of the code**, **anomalies**, and **vulnerabilities** with as much information as possible at the cost of false positive. There are a lot of things reported by aura that are not necessarily actionable by a user but they tell you a lot about the behavior of the code such as doing network communication, accessing sensitive files, or using mechanisms associated with obfuscation indicating a possible malicious code. By collecting this kind of data and aggregating it together, Aura can be compared in functionality to other security systems such as antivirus, IDS, or firewalls that are essentially doing the same analysis but on a different kind of data (network communication, running processes, etc).
 
 Here is a quick overview of differences between Aura and other similar linters and SAST tools:
 
 - **input data**:
 - **Other SAST tools** - usually restricted to only python (target) source code and python version under which the tool is installed.
-- **Aura** can analyze both binary (or nonpython code) and python source code as well. Able to analyze a mixture of python code compatible with different python versions (py2k & py3k) using **the same Aura installation**.
+- **Aura** can analyze both binary (or non-python code) and python source code as well. Able to analyze a mixture of python code compatible with different python versions (py2k & py3k) using **the same Aura installation**.
 - **reporting**:
-- **Other SAST tools** - Aims at integrating well with other systems such as IDEs, CI systems with actionable results while trying to minimize false positives to prevent overwhelming users with too much non-significant alerts.
-- **Aura** - reports as much information as possible that is not immediately actionable such as behavioral and anomaly analysis. Output format is designed for easy machine processing and aggregation rather then human readable.
+- **Other SAST tools** - Aims at integrating well with other systems such as IDEs, CI systems with actionable results while trying to minimize false positives to prevent overwhelming users with too many non-significant alerts.
+- **Aura** - reports as much information as possible that is not immediately actionable such as behavioral and anomaly analysis. The output format is designed for easy machine processing and aggregation rather than human readable.
 - **configuration**:
-- **Other SAST tools** - The tools is fine-tuned to the target project by customizing the signatures to target specific technologies used by the target project. Overriding configuration is often possible by inserting comments inside the source code such as ``# nosec`` that will suppress the alert at that position
+- **Other SAST tools** - The tools are fine-tuned to the target project by customizing the signatures to target specific technologies used by the target project. The overriding configuration is often possible by inserting comments inside the source code such as ``# nosec`` that will suppress the alert at that position
 - **Aura** - it is expected that there is little to no knowledge in advance about the technologies used by code that is being scanned such as auditing a new python package for approval to be used as a dependency in a project. In most cases, it is not even possible to modify the scanned source code such as using comments to indicate to linter or aura to skip detection at that location because it is scanning a copy of that code that is hosted at some remote location.
 
 
@@ -62,7 +62,7 @@ Running Aura
 
 docker run -ti --rm sourcecodeai/aura:dev scan pypi://requests -v
 
-Aura uses a socalled URIs to identify the protocol and location to scan, if no protocol is used, the scan argument is treated as a path to the file or directory on a local system.
+Aura uses a so-called URIs to identify the protocol and location to scan, if no protocol is used, the scan argument is treated as a path to the file or directory on a local system.
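
The scan-argument rule in the changed README line (a protocol prefix selects the source, and anything without a protocol is a local path) can be sketched roughly as follows. This is only an illustration: the helper name `parse_scan_target` and the `"local"` label are hypothetical and are not taken from Aura's code base.

```python
from typing import Tuple


def parse_scan_target(argument: str) -> Tuple[str, str]:
    """Split a scan argument into (protocol, location).

    Hypothetical helper for illustration only; not part of Aura.
    """
    protocol, separator, location = argument.partition("://")
    if not separator:
        # No protocol given: treat the whole argument as a local file or directory path.
        return ("local", argument)
    return (protocol, location)


print(parse_scan_target("pypi://requests"))
print(parse_scan_target("./some_package"))
```

A real implementation would presumably map each protocol to a handler that knows how to fetch the package; that lookup is left out of this sketch.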
docs/source/analyzers.rst (8 additions & 8 deletions)
--- a/docs/source/analyzers.rst
+++ b/docs/source/analyzers.rst
@@ -6,31 +6,31 @@ Aura ships by default with a huge amount of built-in analyzers. To find which an
 Technical description
 =====================
 
-Analyzers are developed as hooks that take input data for processing and output either detection result or a ScanLocation for Aura to scan. There are two major types of analyzers. The first one is a classic "normal" analyzer that receives as input a file/directory path with metadata and performs an analysis. This way any kind of file can be processed including non-source code (Python) files. Second type of analyzer is called visitor. It takes an already parsed source code as an input (AST tree) and performs tree traversal, detections and modifications of this tree. A visitor analyzer can modify the tree and such visitors can be chained together which is a core part of a static analysis functionality. A visitor workflow on top of a Python source code is as following:
+Analyzers are developed as hooks that take input data for processing and output either detection result or a ScanLocation for Aura to scan. There are two major types of analyzers. The first one is a classic "normal" analyzer that receives as input a file/directory path with metadata and performs an analysis. This way any kind of file can be processed including non-source code (Python) files. The second type of analyzer is called a visitor. It takes an already parsed source code as an input (AST tree) and performs tree traversal, detections, and modifications of this tree. A visitor analyzer can modify the tree and such visitors can be chained together which is a core part of static analysis functionality. A visitor workflow on top of a Python source code is as follows:
 
-- Convert: converts a raw json (parsed ast) into internal representation of nodes that aura uses for further analysis.
-- Rewrite: rewrites the AST tree into while retaining it's semantic equivalent. This is done by applying rules such as constant propagation, string concatenation etc... that removes an unnecessary complexity from the AST tree.
+- Convert: converts a raw JSON (parsed ast) into an internal representation of nodes that aura uses for further analysis.
+- Rewrite: rewrites the AST tree while retaining its semantic equivalent. This is done by applying rules such as constant propagation, string concatenation, etc... that remove unnecessary complexity from the AST tree.
 - Taint Analysis: performs taint analysis using defined semantic rules.
-- Read Only: runs all readonly node visitors, see description below.
+- Read Only: runs all read-only node visitors, see description below.
 
-Readonly visitors are a special type of visitors that as the name suggest are prohibited doing any kind of modifications to the tree. This is where the majority of detections that produce results are happening. Since these analyzers are readonly, Aura can run them in parallel on each visited node instead of doing a separate tree traversal for each of the analyzers. This provides a massive performance boost and it is highly recommended to always code AST node analyzers as readonly visitors.
+Read-only visitors are a special type of visitors that as the name suggests are prohibited from doing any kind of modifications to the tree. This is where the majority of detections that produce results are happening. Since these analyzers are read-only, Aura can run them in parallel on each visited node instead of doing a separate tree traversal for each of the analyzers. This provides a massive performance boost and it is highly recommended to always code AST node analyzers as read-only visitors.
 
 
 ScanLocation is a special type of an item that points to either a directory or a file and tells aura to scan it using enabled analyzers. A common use case for outputting a ScanLocation is when the analyzer itself for example unpacks a zip file and want to process the extracted files in a recursive way
 
-Detection result is a standard way to produce an information/result that is by the end of the analysis reported back to the user or serialized into output format.
+The detection result is a standard way to produce an information/result that is by the end of the analysis reported back to the user or serialized into the output format.
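
The Rewrite and Read Only stages described in this file can be illustrated with a small sketch. It assumes Python's standard `ast` module as a stand-in for Aura's own node representation. The class and function names (`FoldStringConcat`, `find_calls`, `find_strings`, `analyze`) are hypothetical, and the read-only analyzers are simply called one after another on each node of a single shared traversal rather than truly in parallel.

```python
import ast


class FoldStringConcat(ast.NodeTransformer):
    """Rewrite stage: fold "a" + "b" into "ab", a tiny form of constant propagation."""

    def visit_BinOp(self, node):
        self.generic_visit(node)  # fold children first so chained additions collapse bottom-up
        left, right = node.left, node.right
        if (
            isinstance(node.op, ast.Add)
            and isinstance(left, ast.Constant) and isinstance(left.value, str)
            and isinstance(right, ast.Constant) and isinstance(right.value, str)
        ):
            folded = ast.Constant(value=left.value + right.value)
            return ast.copy_location(folded, node)
        return node


def find_calls(node, results):
    """Read-only analyzer: record names of directly called functions."""
    if isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
        results.append(("call", node.func.id))


def find_strings(node, results):
    """Read-only analyzer: record string constants as they appear after rewriting."""
    if isinstance(node, ast.Constant) and isinstance(node.value, str):
        results.append(("string", node.value))


def analyze(source, readonly_analyzers):
    """Run the rewrite stage, then all read-only analyzers in one tree traversal."""
    tree = FoldStringConcat().visit(ast.parse(source))
    results = []
    for node in ast.walk(tree):  # single traversal shared by every read-only analyzer
        for analyzer in readonly_analyzers:
            analyzer(node, results)
    return results


print(analyze('eval("im" + "port" + " os")', [find_calls, find_strings]))
```

Because the rewrite stage runs first, the string analyzer sees the reassembled `"import os"` rather than three harmless-looking fragments, which is the kind of complexity removal the Rewrite step is meant to provide.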
docs/source/apip.rst (3 additions & 1 deletion)
--- a/docs/source/apip.rst
+++ b/docs/source/apip.rst
@@ -2,4 +2,6 @@ apip
 ====
 
 Aura contains an experimental wrapper around the ``pip`` command that would intercept any package installation and sends it to Aura for analysis.
-This wrapper is available under the `<project root>/aura/apip.py` and can be copy/pasted into your bin directory. this is done automatically in case the framework is installed via poetry. apip requires to have the ``AURA_PATH`` environment variable set to point to the aura installation, e.g. the `aura` command, which you can find by running ``which aura`` in your shell. Usage of apip is exactly the same as using the pip command, it proxies everything behind the scenes to the pip script and monkey patch the pip installation to allow intercepting of the package installation.
+This wrapper is available under the `<project root>/aura/apip.py` and can be copy/pasted into your bin directory. this is done automatically in case the framework is installed via poetry. apip requires having the ``AURA_PATH`` environment variable set to point to the aura installation, e.g. the `aura` command, which you can find by running ``which aura`` in your shell. Usage of apip is exactly the same as using the pip command, it proxies everything behind the scenes to the pip script and monkey patch the pip installation to allow intercepting of the package installation.
+
+As pip itself does not provide any standard mechanism to hook into package installation, the ``apip`` is using a monkey patching technique to modify existing pip structures to be able to intercept package installations. We are trying to push for a native functionality using this GitHub issue ticket: https://github.com/pypa/pip/issues/8938 .
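
The monkey patching technique mentioned in the added paragraph can be sketched in general terms as below. The `Installer` class is a made-up stand-in and does not correspond to any real pip internals. `intercept` is likewise a hypothetical helper that only shows the generic pattern of wrapping an existing method so a hook runs before the original code.

```python
import functools


class Installer:
    """Stand-in for a third-party installer class; NOT pip's real internals."""

    def install(self, package):
        return f"installed {package}"


def intercept(cls, method_name, hook):
    """Replace cls.method_name with a wrapper that calls hook before the original."""
    original = getattr(cls, method_name)

    @functools.wraps(original)
    def wrapper(self, *args, **kwargs):
        # The hook sees the same arguments; raising here would abort the installation.
        hook(*args, **kwargs)
        return original(self, *args, **kwargs)

    setattr(cls, method_name, wrapper)
    return original  # returned so the patch can be reverted later


seen = []
intercept(Installer, "install", seen.append)
```

In a wrapper like apip, the hook would presumably hand the package over for scanning instead of appending to a list. The patch stays fragile because it depends on the patched structure keeping its name and signature, which is likely why a native hook is being requested upstream.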