
Potential project list
======================

This is a list of projects that may appeal to potential contributors
who are seriously interested in the PyPy project. They mostly share common
patterns: they're mid-to-large in size, they're usually well defined as
standalone projects, and they're not being actively worked on. For small
projects that you might want to work on, it's much better to either look
at the `issue tracker`_, pop up on #pypy on irc.freenode.net or write to the
`mailing list`_, because small possible projects tend to change very rapidly.

This list is mostly meant to give an overview of potential projects. It is
by definition not exhaustive, and we're pleased if people come up with their
own improvement ideas. In any case, if you feel like working on some of those
projects, or anything else in PyPy, pop up on IRC or write to us on the
`mailing list`_.

Make big integers faster
-------------------------

PyPy's implementation of the Python ``long`` type is slower than CPython's.
Find out why and optimize it.
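
A first step could be a small timing script that is run under both
interpreters to see which operations lag behind. The following is only a
sketch: the function name ``time_bigint_ops``, the operand sizes and the
choice of operations are assumptions made for illustration, not an existing
PyPy benchmark.

.. code-block:: python

    import timeit


    def time_bigint_ops(bits=4096, repeat=5, number=200):
        """Return the best-of-`repeat` time in seconds for each operation.

        The operands are arbitrary illustrative values, not a tuned workload.
        """
        a = (1 << bits) - 1
        b = (1 << (bits // 2)) + 12345
        operations = {
            "add": lambda: a + b,
            "mul": lambda: a * b,
            "floordiv": lambda: a // b,
            "str": lambda: str(a),
        }
        return {
            name: min(timeit.repeat(func, repeat=repeat, number=number))
            for name, func in operations.items()
        }


    if __name__ == "__main__":
        for name, seconds in sorted(time_bigint_ops().items()):
            print("%-10s %.6f s" % (name, seconds))

Running the same file with ``python`` and with ``pypy`` and comparing the
columns gives a rough idea of where to start looking.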

Make bytearray type fast
------------------------

PyPy's bytearray type is very inefficient. It would be an interesting
task to look into possible optimizations on this.

Numpy improvements
------------------

NumPy support is progressing rapidly in PyPy, so feel free to come to IRC and
ask for a proposed topic. A not necessarily up-to-date `list of topics`_
is also available.

.. _`list of topics`: https://bitbucket.org/pypy/extradoc/src/extradoc/planning/micronumpy.txt

Improving the jitviewer
------------------------

Analyzing the performance of applications is always tricky. We have various
tools that help us with this, for example the `jitviewer`_.

The jitviewer shows the code generated by the PyPy JIT in a hierarchical way,
as shown by the screenshot below:

  - at the bottom level, it shows the Python source code of the compiled loops

  - for each source code line, it shows the corresponding Python bytecode

  - for each opcode, it shows the corresponding jit operations, which are the
    ones actually sent to the backend for compiling (such as ``i15 = i10 <
    2000`` in the example)

.. image:: image/jitviewer.png

The jitviewer is a web application based on flask and jinja2 (and jQuery on
the client): if you have good web development skills and want to help PyPy,
this is an ideal task to get started with, because it does not require any
deep knowledge of the internals.

Optimized Unicode Representation
--------------------------------

CPython 3.3 will use an `optimized unicode representation`_ which switches between
different ways to represent a unicode string, depending on whether the string
fits into ASCII, has only two-byte characters or needs four-byte characters.

The actual details would be rather different in PyPy, but we would like to have
the same optimization implemented.
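
To make the idea concrete, the choice between representations can be modelled
in a few lines of Python. This is only an illustration of the selection rule
described above, not how CPython or PyPy would implement it; the function name
``storage_width`` is hypothetical and the thresholds follow the three cases
named in the text (the PEP itself also stores Latin-1 text in one byte per
character).

.. code-block:: python

    def storage_width(text):
        """Return the bytes per character of the narrowest fitting storage."""
        highest = max((ord(ch) for ch in text), default=0)
        if highest < 0x80:
            return 1  # pure ASCII
        if highest < 0x10000:
            return 2  # every character fits in two bytes
        return 4      # at least one character needs four bytes


    def encode_fixed_width(text):
        """Pack `text` into a fixed-width byte string of the chosen width."""
        width = storage_width(text)
        return b"".join(ord(ch).to_bytes(width, "little") for ch in text)

With a fixed width per string, indexing stays a constant-time operation while
ASCII-only strings take a quarter of the memory of a four-byte representation.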

.. _`optimized unicode representation`: http://www.python.org/dev/peps/pep-0393/

Translation Toolchain
---------------------

* Incremental or distributed translation.

* Allow separate compilation of extension modules.

Work on some of the other languages
-----------------------------------

There are various languages implemented using the RPython translation toolchain.
One of the most interesting is the `JavaScript implementation`_, but there
are others like Scheme or Prolog. An interesting project would be to improve
the jittability of those or to experiment with various optimizations.

Various GCs
-----------

PyPy has a pluggable garbage collection policy. This means that various garbage
collectors can be written for specialized purposes, or even various
experiments can be done for the general purpose. Examples:

* An incremental garbage collector that has specified maximal pause times,
  crucial for games

* A garbage collector that compacts memory better for mobile devices

* A concurrent garbage collector (a lot of work)

Remove the GIL
--------------

This is a major task that requires lots of thinking. However, unless a better
plan can be thought out, a few subprojects can potentially be specified:

* A thread-aware garbage collector

* Better RPython primitives for dealing with concurrency

* JIT passes to remove locks on objects

* (maybe) implement locking in Python interpreter

* alternatively, look at Software Transactional Memory

Introduce new benchmarks
------------------------

We're usually happy to introduce new benchmarks. Please consult us
beforehand, but in general something that's real-world Python code
and is not already represented is welcome. We need at least a standalone
script that can run without parameters. Example ideas (the benchmarks still
need to be extracted from these projects!):

* `hg`

* `sympy`
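
As a rough sketch of what "a standalone script that can run without
parameters" could look like, here is a minimal skeleton. The ``workload``
function is a placeholder invented for this example; a real benchmark would
call into the project being measured instead. Running several iterations is
our suggestion, so that warm-up effects of the JIT become visible.

.. code-block:: python

    import time


    def workload(n=200):
        """Placeholder workload: count repeated words in a generated list."""
        words = ["item%d" % (i % 50) for i in range(n * 100)]
        counts = {}
        for word in words:
            counts[word] = counts.get(word, 0) + 1
        return len(counts)


    def main(iterations=10):
        """Run the workload several times and return the per-run timings."""
        timings = []
        for _ in range(iterations):
            start = time.time()
            workload()
            timings.append(time.time() - start)
        return timings


    if __name__ == "__main__":
        for index, seconds in enumerate(main()):
            print("iteration %d: %.4f s" % (index, seconds))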

Experiment (again) with LLVM backend for RPython compilation
------------------------------------------------------------

We already tried working with LLVM, and at the time LLVM was not mature enough
for our needs. It's possible that this has changed, so reviving the LLVM
backend (or writing a new one from scratch) for static compilation would be a
good project.

(On the other hand, just generating C code and using clang might be enough.
The issue with that is the so-called "asmgcc GC root finder", which has tons
of issues of its own.  In my opinion (arigo), it would definitely be a
better project to try to optimize the alternative, the "shadowstack" GC root
finder, which is nicely portable.  So far it gives a PyPy that is around
7% slower.)

Embedding PyPy
----------------------------------------

Being able to embed PyPy, say with its own limited C API, would be
useful.  But here is the most interesting variant, straight from a live
discussion at EuroPython :-)  We can have a generic "libpypy.so" that
can be used as a placeholder dynamic library, and when it gets loaded,
it runs a .py module that installs (via ctypes) the interface it wants
exported.  This would give us a one-size-fits-all generic .so file to be
imported by any application that wants to load .so files :-)


.. _`issue tracker`: http://bugs.pypy.org
.. _`mailing list`: http://mail.python.org/mailman/listinfo/pypy-dev
.. _`jitviewer`: http://bitbucket.org/pypy/jitviewer
.. _`JavaScript implementation`: https://bitbucket.org/pypy/lang-js/overview
