
Computer engineers and programmers have long relied on reverse engineering as a way to copy the functionality of a computer program without directly copying that program’s copyright-protected code. Now, AI coding tools are raising new issues about how the “clean room” rewriting process works legally, ethically, and practically.
Those issues came to a head last week with the release of a new version of Chardet, a popular open source Python library for automatically detecting character encodings. The repository was originally written by coder Mark Pilgrim in 2006 and released under the LGPL license, which places strict limits on its reuse and redistribution.
Dan Blanchard took over maintenance of the repository in 2012 but ran into some controversy with the release of version 7.0 of Chardet last week. Blanchard described that overhaul as a “ground-up, MIT-licensed rewrite” of the entire library, built with the help of Claude Code, that is “much faster and more accurate” than before.
Speaking to The Register, Blanchard said that he had long wanted to add Chardet to the Python standard library but had not had time to fix the problems with “its license, its speed, and its accuracy” that were getting in the way of that goal. With the help of Claude Code, however, Blanchard said he was able to overhaul the library “in about five days” and got a 48x performance boost to boot.
Not everyone is happy with that outcome, though. A poster using the name Mark Pilgrim on GitHub came forward to argue that this new version amounts to illegally relicensing Pilgrim’s original code under the more permissive MIT license (which, among other things, allows its use in closed-source projects). Pilgrim argues that, as a modification of his original LGPL-licensed code, this new version of Chardet should also retain the same LGPL license.