Copydetect

Latest version: v0.5.0

0.4.1

- Fix: the program now behaves identically for config files and command-line arguments. In particular, parameters which have defaults (reference directories, noise threshold, extensions) no longer need to be set when a config file is used; they fall back to their defaults just like command-line parameters.

- API Update: the `config` parameter to `CopyDetector` has been deprecated and will be removed in a future version, because it could cause ambiguity when parameters were provided both in the `config` dictionary and as optional arguments to the detector. To initialize a `CopyDetector` object using a config dictionary, use the new `CopyDetector.from_config()` function. The change described above also applies to this function: missing values in the provided dictionary will be filled with defaults where applicable.

- Update: the default `guarantee_threshold` has been lowered to match the `noise_threshold` (30 --> 25). This comes with a small performance drop, but the gap between the two thresholds seemed to be causing some confusion.
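The default fallback described in these notes can be pictured as a dictionary merge. The following is a minimal sketch, not copydetect's implementation: `DEFAULTS`, `resolve_config`, the key names, and the extensions value are hypothetical, and only the threshold value of 25 is taken from the notes above.

```python
import copy

# Hypothetical defaults table. Only the two threshold values come from the
# release notes; the "extensions" entry is an assumed placeholder.
DEFAULTS = {
    "noise_threshold": 25,
    "guarantee_threshold": 25,
    "extensions": ["*"],
}


def resolve_config(user_config):
    """Return a new dict: user-supplied values, with gaps filled from DEFAULTS."""
    resolved = copy.deepcopy(DEFAULTS)  # avoid sharing mutable defaults
    resolved.update(user_config)        # explicit values always win
    return resolved


config = resolve_config({"noise_threshold": 10})
print(config["noise_threshold"], config["guarantee_threshold"])
```

Applying the same merge to both command-line arguments and config files is one way to make the two input paths behave identically.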

0.4.0

Fix/feature: the similarity matrix is no longer necessarily square. There will no longer be large gaps when test files != reference files.
Bug fix: similarity is now based on the number of fingerprints rather than the number of tokens. This improves detection for files with large amounts of duplication (e.g., XML files).
Feature: `fp` argument for `CodeFingerprint`: fingerprints can now be initialized with file pointers rather than just a file path.
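To illustrate the first two items, here is a rough sketch of a fingerprint-based similarity matrix whose rows are test files and whose columns are reference files. It is an assumption-laden illustration rather than copydetect's actual formula: `similarity`, `similarity_matrix`, and the integer fingerprint sets are all hypothetical.

```python
def similarity(test_fps, ref_fps):
    """Fraction of the test file's distinct fingerprints also seen in the reference."""
    test_fps, ref_fps = set(test_fps), set(ref_fps)
    if not test_fps:
        return 0.0
    return len(test_fps & ref_fps) / len(test_fps)


def similarity_matrix(test_files, ref_files):
    """One row per test file, one column per reference file (need not be square)."""
    return [[similarity(t, r) for r in ref_files] for t in test_files]


# Two test files compared against three reference files gives a 2 x 3 matrix.
tests = [{1, 2, 3, 4}, {5, 6}]
refs = [{1, 2}, {5, 6, 7}, {9}]
matrix = similarity_matrix(tests, refs)
print(len(matrix), len(matrix[0]))
```

Because each fingerprint is counted once, a file that repeats the same structure many times does not have its score skewed by the repetition.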

0.3.0

Improvement: both images and the style sheet have been merged into the output HTML file. Instead of saving an output directory, copydetect outputs a single HTML file with a name/path controlled using the `-o` parameter (default: `report.html`).
Improvement: output report now uses Bootstrap 5.

Bug Fix: changed deprecated jinja2.escape import to markupsafe.escape
Bug Fix: preprocessor directives are now correctly tokenized for languages which use them (6)
Bug Fix: token.Name.Variable, token.Name.Attribute tokens are now treated as variables in addition to token.Name tokens. This improves tokenization for certain languages (primarily Java).

0.2.1

Improvement: the style sheet is now copied to the output folder rather than directly linking to its location in the package data to improve output portability.

0.2.0

New Features:
- The `CopyDetector` object is now much easier to work with using the Python API. Detector parameters can be passed directly to the object, and there is a new `add_file` function to add new test/reference/boilerplate files. This is mostly intended for cases where simply adding directories containing test/boilerplate files is not sufficient, so the user would instead want to write their own code to collect files.
- Added a new `force_language` parameter (`-o`, for override) in case the language detected by pygments based on the file extension is not correct.
- Added a new `truncate` parameter (`-T`) which will truncate the highlighted output displayed on the report. Any code that is not within 10 lines of code which has been flagged for plagiarism will be replaced with "...". In future versions this option will be switched to a toggle on the HTML report, so for now it's off by default.
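The truncation rule can be sketched as follows. This is an illustrative sketch only: `truncate_lines` and its arguments are hypothetical names, and treating "within 10 lines" as inclusive on both sides is an assumption.

```python
def truncate_lines(lines, flagged, context=10):
    """Keep lines within `context` lines of a flagged line; collapse the rest to '...'."""
    keep = set()
    for idx in flagged:
        start = max(0, idx - context)
        stop = min(len(lines), idx + context + 1)
        keep.update(range(start, stop))

    out = []
    in_gap = False  # True while we are inside a run of omitted lines
    for i, line in enumerate(lines):
        if i in keep:
            out.append(line)
            in_gap = False
        elif not in_gap:
            out.append("...")  # one marker per omitted run
            in_gap = True
    return out


source = [f"line {i}" for i in range(50)]
shortened = truncate_lines(source, flagged=[25])
print(len(shortened))
```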

0.1.4

Bug fixes:
- Firefox windows can now find style.css
- The file collector no longer matches directories (directories with file extensions were being matched, crashing the program)
- The detector will no longer attempt to load binary boilerplate files
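The directory-matching bug is a common pitfall with glob patterns, since a pattern such as `*.py` also matches a directory named `pkg.py`. A minimal sketch of the usual guard, using a hypothetical `collect_files` helper rather than copydetect's own collector:

```python
from pathlib import Path


def collect_files(root, extension):
    """Return regular files under root with the given extension, skipping directories."""
    pattern = f"*.{extension}"
    # rglob matches on names only, so filter out directories explicitly.
    return sorted(p for p in Path(root).rglob(pattern) if p.is_file())
```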
