GSOC 2018 Ideas

From D Wiki
Revision as of 09:35, 2 October 2019 by Brian (talk | contribs) (gRPC in D)

This is the D Google Summer of Code page for 2018. If you are interested in participating in GSoC 2018 as either a student or mentor, and want to do something related to D, please feel free to contact us at gsoc (at) dlang (dot) io.


Timeline

The timeline for GSoC 2018 can be found here.

Ideas

Plenty of challenging and important projects exist in the D world. They range from writing new or improving existing modules of D's standard library (Phobos), working on its compilers (Compilers), shaping GUI libraries for D, integrating D with other languages and more.

Language Server Protocol for D


The Language Server Protocol (LSP) defines the protocol used between an editor or IDE and a language server that provides language features like auto complete, go to definition, find all references etc. Adding features like auto complete, go to definition, or documentation on hover for a programming language takes significant effort. Traditionally this work had to be repeated for each development tool, as each tool provides different APIs for implementing the same feature (see the list of Editors and IDEs). A Language Server is meant to provide the language-specific smarts and communicate with development tools over a protocol that enables inter-process communication. The idea behind the Language Server Protocol (LSP) is to standardize the protocol for how such servers and development tools communicate. This way, a single D Language Server can be re-used in multiple development tools, which in turn can support multiple languages with minimal effort.
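To make the protocol concrete, here is a minimal sketch in Python of how one LSP message is framed on the wire: a `Content-Length` header, a blank line, then a JSON-RPC body. The file URI and the position are made-up values for illustration, and the helper names `encode_message` and `decode_message` are hypothetical, not part of any existing D or LSP library.

```python
import json


def encode_message(payload: dict) -> bytes:
    """Frame a JSON-RPC payload with the Content-Length header LSP expects."""
    body = json.dumps(payload).encode("utf-8")
    header = f"Content-Length: {len(body)}\r\n\r\n".encode("ascii")
    return header + body


def decode_message(data: bytes) -> dict:
    """Parse one framed message back into a Python dict."""
    header, _, body = data.partition(b"\r\n\r\n")
    fields = {}
    for line in header.split(b"\r\n"):
        name, _, value = line.partition(b":")
        fields[name.strip().lower()] = value.strip()
    length = int(fields[b"content-length"])
    return json.loads(body[:length].decode("utf-8"))


# A "go to definition" request as an editor might send it.
# The URI and position are illustrative assumptions.
request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "textDocument/definition",
    "params": {
        "textDocument": {"uri": "file:///project/source/app.d"},
        "position": {"line": 10, "character": 4},
    },
}

wire = encode_message(request)
roundtrip = decode_message(wire)
```

A D language server would read such messages from its input, answer with a response carrying the same `id`, and leave all editor-specific work to the client.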

Since the summer of 2017, DMD has been available as a DUB package and can be used for this challenge.

Proposed Project Mentors: Sebastian Wilzbach, Jacob Carlborg

Mir Project


The Mir project is developing numerical libraries for the upcoming numeric packages of the Dlang standard library. There are numerous projects an interested student could pursue:

  • University project. Your GSoC project can be combined with your scientific research.
  • ndslice <-> Julia integration
  • ndslice <-> NumPy integration
  • General purpose BetterC libraries
    • I/O betterC Dlang library
    • Async betterC Dlang library
    • String and format betterC library
  • mir-cpuid
    • ARM support
    • Advanced OS specific support
  • mir-glas
    • BLAS Level 2 subprograms
    • One BLAS Level 3 subprogram. It is OK to do only one! But it must be faster than OpenBLAS, and not slower than Intel MKL.
    • Multithreading support for BLAS Level 2
    • Multithreading support for BLAS Level 3

You can get more details on projects related to Mir here.

It's Good To Know:

To work on Mir you should be proficient with one of:

  • C
  • C++
  • LLVM
  • Fortran
  • Experience with D is not essential.

Working on the Mir project requires a responsible and self-motivated student.

Proposed Project Mentor: Ilya Yaroshenko

ROS2 client


Robot Operating System (ROS) is a robotics middleware (i.e. a collection of software frameworks for robot software development). It provides services designed for a heterogeneous computer cluster such as hardware abstraction, low-level device control, implementation of commonly used functionality, message-passing between processes, and package management. A ROS2 client package would allow controlling robots in a high-level, modern language without compromising performance (in contrast to popular interpreted languages like Python).

  • functionality and API equivalent to the ROS2 C++ client library (which has the most advanced feature set right now)
  • functionality to support building D client library packages

A short technical summary can be found here.

Proposed Project Mentor: TBA

Tabular data container (data frames)


Pandas, R and Julia have made data frames very popular. As D is getting more interest from data scientists (e.g. at eBay or AdRoll), it would be very beneficial to use one language for the entire data analysis pipeline, especially considering that D (in contrast to popular languages like Python, R or Julia) is compiled to native machine code and gets optimized by the sophisticated LLVM backend.

Minimum requirements:

  • conversion to and from CSV
  • multi-indexing
  • column binary operations, e.g. `column1 * column2`
  • group-by on an arbitrary number of columns
  • column/group aggregations
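As a rough illustration of what the requirements above mean in practice, the following Python sketch uses only the standard library to parse CSV into columns, combine two columns with a binary operation, and aggregate by one or more key columns. The sample data and the helper names (`read_columns`, `multiply`, `group_sum`) are invented for this example; a D data frame library would be free to choose a very different API.

```python
import csv
import io
from collections import defaultdict

# Invented sample data for the sketch.
CSV_TEXT = """city,product,price,quantity
Berlin,apple,2,10
Berlin,pear,3,5
Munich,apple,2,7
"""


def read_columns(text):
    """Parse CSV text into a dict mapping column name -> list of values."""
    rows = list(csv.DictReader(io.StringIO(text)))
    return {name: [row[name] for row in rows] for name in rows[0]}


def multiply(left, right):
    """Element-wise binary operation on two columns (column1 * column2)."""
    return [a * b for a, b in zip(left, right)]


def group_sum(columns, keys, value):
    """Group by an arbitrary number of key columns and sum one value column."""
    totals = defaultdict(int)
    for i, v in enumerate(columns[value]):
        group = tuple(columns[k][i] for k in keys)
        totals[group] += v
    return dict(totals)


table = read_columns(CSV_TEXT)
table["price"] = [int(v) for v in table["price"]]
table["quantity"] = [int(v) for v in table["quantity"]]
table["revenue"] = multiply(table["price"], table["quantity"])

per_city = group_sum(table, ["city"], "revenue")
per_city_product = group_sum(table, ["city", "product"], "revenue")
```

In D, the column types could be known at compile time, so the string-to-integer conversion done by hand here could be part of the CSV reader itself.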

Proposed Project Mentor: TBA

Jupyter notebook D kernel


The Jupyter Notebook is an open-source web application that allows you to create and share documents that contain live code, equations, visualizations and narrative text. Uses include: data cleaning and transformation, numerical simulation, statistical modeling, data visualization, machine learning, and much more. Jupyter notebooks are extremely popular among data scientists as they allow interactive progress and exploring a solution step by step. With a bit of work, a statically compiled language can be used in an interactive notebook, e.g. Cling (https://blog.jupyter.org/interactive-workflows-for-c-with-jupyter-fe9b54227d92) and Go with gophernotes. Apart from being useful to data scientists, a Jupyter D kernel would allow an intuitive exploration of the language for newcomers and people starting with new libraries.

Existing work:

  • drepl - simple REPL for D
  • PydMagic - allows writing PyD extensions in a notebook

Proposed Project Mentor: TBA

ORM a-la SQLAlchemy


SQLAlchemy is an object-relational mapper which provides "a full suite of well known enterprise-level persistence patterns, designed for efficient and high-performing database access, adapted into a simple and Pythonic domain language". While there have been some attempts at implementing an ORM in D, none of these got close to the simplicity and usability of SQLAlchemy (see Database Libraries). However, with D's CTFE, queries could - like std.format's strings - already be checked at compile-time and optimizations could be applied to the parser and serializer.

This project would be based on the proposed std.database abstraction and focus on creating a general-purpose ORM on top of it.

See also: Martin Nowak's DConf16 talk about Object-Relational Mapper

Proposed Project Mentors: TBA

Existing work:

QUIC for Vibe.d

The QUIC protocol (Quick UDP Internet Connections) is an entirely new protocol for the web developed on top of UDP instead of TCP. Its major advantage is hugely decreased latency. Hence, Vibe.d - D's flagship web application framework - could vastly profit from using the QUIC protocol.

Proposed Project Mentors: Sönke Ludwig

HTTP/2 for Vibe.d

HTTP/2 is a major revision of the HTTP network protocol. Its advantages include decreased latency (e.g. due to compressed HTTP headers or pipelined requests) and server push, which allows the server to send resources to the client before they are requested. Therefore, Vibe.d - D's flagship web application framework - would vastly profit from implementing HTTP/2.

Proposed Project Mentors: Sönke Ludwig

Existing libraries with HTTP/2 support: hunt-http

std.benchmark


Ideally every function in Phobos should have its own benchmark suite which is run for every PR to check for performance improvements or regressions.

std.i18n


Design and implement a basic internationalization framework. It may be possible to implement this with pragma(msg). For proof of concept see http://arsdnet.net/dcode/i18n.d . It should provide at least the following functionality:

  • A locale part, std.i18n.locale, which should detect a user's default language, select a global application language, and provide types to describe and work with locales.
  • A text translation part, std.i18n.translation, which should be gettext compatible, preferably using the gettext serialized hashtables in .mo files, and provide low level (gettext like) and high level (boost::locale like) APIs.
  • A tool to extract strings which need to be translated from D code. This should preferably be based on DScanner but alternately could use regular expressions. Optionally support for D code could be added to xgettext.

std.parallelism


std.parallelism needs a review and some benchmarking - prior to making improvements. As part of this it would be good to have a standard benchmarking framework, hence the idea of std.benchmark. However there is no need for it to be in std (and hence Phobos) in the first instance. So the project(s) would be to create a comparative benchmarking framework that can then be used to analyse std.parallelism on a more scientific basis than has been done to date.
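A comparative benchmarking framework boils down to running several candidate implementations under identical conditions and reporting comparable numbers. The Python sketch below shows that core loop with the standard `timeit` module. The function `compare` and the two candidates are illustrative assumptions only, and a real framework for D would additionally need warm-up handling, statistics and regression tracking.

```python
import timeit


def compare(candidates, repeat=5, number=1000):
    """Time each candidate and return the best per-call time in seconds.

    Taking the minimum over several repetitions reduces noise from other
    processes; this is a common convention, not the only sensible choice.
    """
    results = {}
    for name, func in candidates.items():
        timings = timeit.repeat(func, repeat=repeat, number=number)
        results[name] = min(timings) / number
    return results


def manual_loop():
    """Hypothetical baseline: sum the numbers 0..99 with an explicit loop."""
    total = 0
    for i in range(100):
        total += i
    return total


def builtin_sum():
    """Hypothetical alternative: the same computation via the built-in sum."""
    return sum(range(100))


results = compare(
    {"manual_loop": manual_loop, "builtin_sum": builtin_sum},
    repeat=3,
    number=200,
)
for name, seconds in sorted(results.items(), key=lambda item: item[1]):
    print(f"{name}: {seconds * 1e6:.2f} microseconds per call")
```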

std.serialization


A flexible (de)serialization framework that can be used as a standardized building block for various serialization related things: std.json, std.csv, Protocol Buffers, Cap'n Proto, vibe.d... One important goal would be to define how a user-specified type has to work in order to make it serializable, as well as to allow the end-user (e.g. a user of std.json) to change the serialization of third-party types that cannot themselves be modified. A good starting point would be to work with the Orange framework.

std.container


  • support for custom allocator (aka no GC)
  • thread-safe
  • performance
  • existing work: here and here.
  • Additionally, lock-free data structures (e.g. based on Martin Nowak's work) would be a huge benefit.

std.database


  • Provide a general database interface for D
  • Replace etc.c.sqlite with a native implementation
  • Existing work: here, here, and here
  • See also the following DConf talk and this one and this one.

std.encoding


std.encoding needs a complete overhaul. Other encoding packages such as ascii, base64 and utf8 could be included, and a common API should unite them.

std.eventloop


High-performance native event loop abstraction for D.

Existing work:

Phobos inclusion will require that the event loop abstraction is not baked into a specific purpose, as that comes later. The specific events, sources and consumers must be configurable and highly performant. An existing solution that is well tested in this manner is glib's event loop. The closest to this design in D would be SPEW.

For this work to succeed the following things must be in Phobos:

  • The abstraction
  • An example implementation of the event loop

Available in the community:

  • Sockets support
  • Windowing support

To simplify matters only Windows support needs to be included for the example sources and consumers of events.

This work would allow D to have high-performance (and correctly performing) windowing, timer and socket libraries that can integrate with each other, should the sources be cross-supported.

The biggest domain limitation to this work would be threading. The event loop library should be aware that some sources and consumers may be required to be thread only, global only, or both.

std.graph


Graphs are used in all sorts of contexts - from low-level network communication to high-level data science. At the moment there is only one unmaintained D library, but Boost::graph would be a good direction.

Proposed Project Mentors: TBA

Graphics library for resource constrained embedded systems


Create a 2D rasterizer, rich drawing primitives, and 2D graphics library suitable for resource constrained embedded systems (e.g. ARM Cortex-M) to be used in industrial controls, home appliances, medical devices, consumer electronics, and IoT just to name a few. The end goal would be something similar to Segger's emWin. The library would be used to drive LCDs similar to https://www.adafruit.com/product/3396

Requirements:

  • Hardware agnostic; should simply render to a frame buffer
  • No dependencies (No Phobos, no C standard library, and no official D runtime).
  • Consider using -betterC, but a custom minimal D runtime is also a viable option

Related work:

Proposed Project Mentors: TBA

DUB: D's package manager


DUB - D's package manager - is one of the central infrastructure tools in the D world as a convenient build and package management tool for the D community. With D gaining more and more popularity, it's also a key tool to guarantee D's further adoption. Thus there are many great ideas to improve DUB:

Proposed Project Mentors: Sebastian Wilzbach, Sönke Ludwig

Improve specification and implementation of the shared and synchronized keywords


The shared and synchronized keywords have huge potential to make parallel programming in D work like magic. However, at the moment the specification and implementation differ greatly, which leads to many issues. The expected goals are:

  • Unite the implementation of the shared and synchronized keywords with their specification
  • Use shared in std.parallelism and std.concurrency and make them @safe

Functional Reactive Programming


Functional reactive programming (FRP) is a programming paradigm for reactive programming (asynchronous dataflow programming) using the building blocks of functional programming (e.g. map, reduce, filter). The best-known FRP API is the ReactiveX API, whose implementation in D this project would focus on. Note that this project is closely related to the std.eventloop project.
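The core idea can be sketched in a few lines of Python: a stream pushes values to its subscribers, and operators such as map and filter return new streams derived from it. The `Observable` class below is a deliberately tiny, hypothetical illustration and is not the ReactiveX API, which additionally covers completion, errors, schedulers and many more operators.

```python
class Observable:
    """A push-based stream: subscribers are called for every emitted value."""

    def __init__(self):
        self._subscribers = []

    def subscribe(self, callback):
        self._subscribers.append(callback)

    def emit(self, value):
        for callback in self._subscribers:
            callback(value)

    def map(self, transform):
        """Return a new stream carrying transform(value) for each value."""
        derived = Observable()
        self.subscribe(lambda value: derived.emit(transform(value)))
        return derived

    def filter(self, predicate):
        """Return a new stream carrying only values the predicate accepts."""
        derived = Observable()

        def forward(value):
            if predicate(value):
                derived.emit(value)

        self.subscribe(forward)
        return derived


# Hypothetical event source; in practice this could be fed by an event loop.
clicks = Observable()
received = []
clicks.filter(lambda x: x % 2 == 0).map(lambda x: x * 10).subscribe(received.append)

for value in range(5):
    clicks.emit(value)
```

After the loop, `received` holds the even inputs multiplied by ten. In D, the same pipeline style is already familiar from ranges, which pull values instead of having them pushed.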

Proposed Project Mentors: Petar Kirov, TBA

DStep: C++ support


DStep is a tool to generate D bindings from C and Objective-C header files. As more and more people are looking to gradually migrate their project to D, being able to use their existing libraries without any work would lower the transition barrier significantly. In 2016 Wojciech Szęszoł improved DStep by adding, among other things, C preprocessor support. Check out his report.

Related work:

Proposed Project Mentors: Jacob Carlborg

gRPC in D


gRPC is a high performance, open-source universal RPC framework. It works across languages and platforms by automatically generating idiomatic client and server stubs in a variety of languages. The goal of this project is to add D to the supported languages of gRPC and thus allow the use of D in gRPC stacks.

Existing work:

Proposed Project Mentor: Ali Çehreli, TBA

FlatBuffers Support and/or Improved Protocol Buffer Support


FlatBuffers is an efficient cross platform serialization library for C++, C#, C, Go, Java, JavaScript, PHP, and Python. It was originally created at Google for game development and other performance-critical applications.

Currently there is no support for D, so this project would involve building FlatBuffers support from scratch. The goal of the project is to contribute the D support to the upstream repository.

Regarding Protocol Buffers, existing work has been done to provide support for D, however, there are a number of areas that can be improved including:

  • comments and deprecated fields handling
  • generate interfaces for protobuf services
  • benchmarking

Existing work

Proposed Project Mentors: Dragos Carp

Java Native Interface (JNI) library

D's features make it possible to cut down wrapper boilerplate drastically. See the recently published excel-d for a comparison between the C++ implementation for registering a function in Excel and the D way. Similarly, a solid JNI library would open up many new use cases and allow the Java community to profit even more easily from D's features.

Previous work: DJni, djvm

fork()-based Garbage Collector


Gustavo Rodriguez-Rivera and Vincent Russo authored a paper on an interesting take on a garbage collector: leverage efficient fork() implementations, which in many Unix systems take advantage of hardware-provided copy-on-write semantics for duplicated memory pages across processes.
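The mechanism can be illustrated with a toy model in Python (Unix only, since it relies on `os.fork`). The child process receives a copy-on-write snapshot of the parent's memory, marks reachable objects in that snapshot, and reports the unreachable ones back through a pipe while the parent keeps mutating its own heap. The dictionary-based heap, the object names and the function `concurrent_collect` are assumptions made for this sketch; it is not the algorithm from the paper and not Sociomantic's implementation.

```python
import json
import os


def mark(heap, roots):
    """Return the set of object names reachable from the roots."""
    seen = set()
    stack = list(roots)
    while stack:
        name = stack.pop()
        if name in seen:
            continue
        seen.add(name)
        stack.extend(heap[name])
    return seen


def concurrent_collect(heap, roots, mutate):
    """Mark in a forked child while the parent continues to run mutate()."""
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: works on a frozen snapshot of the heap as it was at fork().
        os.close(read_fd)
        garbage = sorted(set(heap) - mark(heap, roots))
        with os.fdopen(write_fd, "w") as out:
            json.dump(garbage, out)
        os._exit(0)

    # Parent: keeps allocating while the child marks.
    os.close(write_fd)
    mutate(heap)
    with os.fdopen(read_fd) as source:
        garbage = json.load(source)
    os.waitpid(pid, 0)

    # Sweep: objects unreachable in the snapshot can never become reachable
    # again, and objects allocated after the fork are not in the list.
    for name in garbage:
        heap.pop(name, None)
    return garbage


# Toy heap: each object maps to the list of objects it references.
heap = {"a": ["b"], "b": [], "c": ["d"], "d": []}
roots = ["a"]


def allocate_more(h):
    """Simulated mutator activity happening during the collection."""
    h["e"] = []
    h["a"].append("e")


collected = concurrent_collect(heap, roots, allocate_more)
```

The sketch ignores the hard parts of a real collector, such as the lock shared between fork() and malloc().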

This idea is already leveraged in the high-performance garbage collector for D implemented and used by Sociomantic. (A lingering issue is that fork() and malloc() share a lock, which is problematic.) Leandro Lucarella, the engineer who wrote the implementation, has open sourced it here, but that is a bitrotten version that has fallen by the wayside.

Leandro would be glad to assist with questions by a motivated implementer. Gustavo has quite a few ideas for improvements, including a possible Windows implementation, and may be able to even coauthor a paper.

Who's (Using) Who?


It happens often that executables include code that seems unused (e.g. a typical "hello world" links in functions that are not easily explained). A tool that shows dependency chains would be a great helper in understanding what dependencies are at work, and would give insight into how to reduce them.

The tool would output for each function all symbols it uses. The tool's output would be in one (or more) popular format of stock tools for graph drawing, such as DOT, Graphviz, Sage, PGF/TikZ, newGRAPH, etc.
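As a small sketch of the output side, the Python function below turns a mapping from each function to the symbols it uses into a DOT graph that Graphviz can render. The symbol names in the example are invented; extracting the real mapping from a binary or from compiler output is the actual work of the project and is not shown here.

```python
def to_dot(uses, graph_name="uses"):
    """Render a {function: [used symbols]} mapping as a DOT digraph."""
    lines = [f"digraph {graph_name} {{"]
    for caller in sorted(uses):
        for callee in sorted(uses[caller]):
            # Quote the names so that mangled or dotted symbols stay valid DOT.
            lines.append(f'  "{caller}" -> "{callee}";')
    lines.append("}")
    return "\n".join(lines)


# Invented example data: which symbols each function pulls in.
uses = {
    "main": ["writeln"],
    "writeln": ["formatValue", "fwrite"],
}

dot_text = to_dot(uses)
print(dot_text)
```

Saved to a file, the output can be rendered with the Graphviz `dot` tool, for example `dot -Tsvg uses.dot -o uses.svg`.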

This can be done using Valgrind's plugin, Callgrind, as explained here.

Proposed Project Mentors: Stefan Rohe

Linux debugger


Project Description

ZeroBUGS is a high-quality, source-level debugger for Linux implemented from first principles in C++ by Cristian Vlasceanu. The author got busy with work and the project has since bitrotten, as did a fork of it by a different engineer.

ZeroBUGS presents amazing opportunities for D/Linux debugging, and Cristian is willing to guide a motivated implementer.

SDC Project - D Compiler as a Library


Project Description

The SDC project (https://github.com/deadalnix/SDC) is an effort to provide a D compiler as a library. Any ideas to further the development of this project are welcome, but for a student who would like a specific project we propose the following:

  • Start by implementing the @property feature of D. This feature will allow a D programmer to create functions that are called using the same syntax as variable access.
  • Using the @property feature the student will be able to implement the runtime support for slices and associative arrays. The operations to implement are as follows:
    • Implement array operations like concatenation and appending, and implement a sound memory management strategy for the underlying data.
    • Implement a generic and efficient hash table. The data structure and algorithms used must be flexible enough to be adapted to any type of data that might be stored in the table. A concurrent version of the table is needed for shared data.
  • Finally, the student will implement masquerading of D syntax into calls for the runtime.
  • Integrate LLVM's new JIT infrastructure in SDC, the On-Request Compilation JIT (ORCJit) API. This would simplify the implementation of certain D features such as Compile Time Function Evaluation (CTFE) for SDC.

It's Good To Know

  • Please watch Amaury's DConf talk on SDC.
  • SDC is developed in D (of course) so you will need to be proficient in D by the time you start coding.
  • You should have taken at least one course on compilers, or at the least be willing to educate yourself in this regard. There is a decent course available through Coursera https://www.coursera.org/course/compilers
  • You should familiarize yourself with classical data structures for arrays and have knowledge of various schemes for table implementations (it is worthwhile to read up on hopscotch and robin hood hashing).
  • SDC uses LLVM for code generation, so some familiarity with LLVM will be required (see http://llvm.org/docs/tutorial/index.html).
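For readers unfamiliar with robin hood hashing, mentioned in the list above, here is a compact Python sketch of the scheme: an open-addressing table where an entry being inserted takes the slot of any resident that sits closer to its home slot, which keeps probe lengths even. The class `RobinHoodMap`, its growth policy (double at 50% load) and the omission of deletion are choices made for this illustration, not a description of how SDC's runtime does or should implement associative arrays.

```python
class RobinHoodMap:
    """Open-addressing hash map using robin hood displacement on insert."""

    def __init__(self, capacity=8):
        # Each slot is None or a tuple (key, value, probe_distance).
        self.slots = [None] * capacity
        self.count = 0

    def _grow(self):
        old = [slot for slot in self.slots if slot is not None]
        self.slots = [None] * (len(self.slots) * 2)
        self.count = 0
        for key, value, _ in old:
            self.put(key, value)

    def put(self, key, value):
        # Keep the load factor at or below 0.5 so probing always terminates.
        if (self.count + 1) * 2 > len(self.slots):
            self._grow()
        index = hash(key) % len(self.slots)
        entry = (key, value, 0)
        while True:
            resident = self.slots[index]
            if resident is None:
                self.slots[index] = entry
                self.count += 1
                return
            if resident[0] == entry[0]:
                # Same key: overwrite the value in place.
                self.slots[index] = (entry[0], entry[1], resident[2])
                return
            if resident[2] < entry[2]:
                # The resident is closer to home than we are: swap, then
                # continue inserting the displaced resident further along.
                self.slots[index], entry = entry, resident
            index = (index + 1) % len(self.slots)
            entry = (entry[0], entry[1], entry[2] + 1)

    def get(self, key, default=None):
        index = hash(key) % len(self.slots)
        distance = 0
        while True:
            resident = self.slots[index]
            # An empty slot or a resident closer to home than our probe
            # distance means the key cannot be further along.
            if resident is None or resident[2] < distance:
                return default
            if resident[0] == key:
                return resident[1]
            index = (index + 1) % len(self.slots)
            distance += 1


table = RobinHoodMap()
for number in range(100):
    table.put(number * 7, str(number))
table.put(14, "updated")
```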

Proposed Project Mentor: Amaury Sechet

Propose your own project


Do you have your own vision for improving D's success? That's great! Are you convinced that your project helps to advance open source technology related to the D programming language? Then simply get in touch with D's GSoC org admin (gsoc (at) dlang (dot) io) and potential mentors for your project.

If you are searching for inspiration, you might want to have a look at the list of D Improvement Proposals (DIPs), the high-level vision for this semester or the wish list.

Ideas From Previous Years

GSoC idea pages from past years:

Tips for students

Please see the "gsoc" articles at our official blog which contain experience reports from students of previous years.

Daniel Pocock has written a detailed blog about initiatives students can take if they want to have a serious chance of being selected in GSoC without a focus on one specific organization.

To learn more about potential mentors, see below.

Tips for Mentors

If you are interested in mentoring, please check out the organization administrator and mentor manual for more information.

D Mentors

This section provides brief biographies of mentors for Google Summer of Code Projects for the D Programming Language. If you are looking for information on the creatures from the Harry Potter books, try here.

Andrei Alexandrescu

Andrei is the author of the book "The D Programming Language" (2010), and is the author of the award winning "Modern C++ Design" (2001), and with Herb Sutter, "C++ Coding Standards" (2005). Since 2006 Andrei has collaborated closely with Walter Bright, the creator and driving force behind D, on the design and implementation of the language and its standard library. He currently works with Facebook. Andrei's favorite English words are "No" and "Destroy". You can check out Andrei in action here.

Jacob Carlborg

In his day job, Jacob Carlborg is a Ruby backend developer for Derivco Sweden, but he’s been using D on his own time since 2006. He is the maintainer of numerous open source projects, including DStep, a utility that generates D bindings from C and Objective-C headers, DWT, a port of the Java GUI library SWT, and DVM. He implemented native Thread Local Storage support for DMD on OS X and contributed, along with Michel Fortin, to the integration of Objective-C in D.

Dragos Carp

Dragos started writing embedded software for laser applications. Currently he works in Munich coding in D on passenger information systems. He has an interest in asynchronous programming and communication protocols and ported the asyncio python library to D.

Ali Çehreli

Ali has been working with C, C++, and D in Silicon Valley since 1996. He is the author of Programming in D, and is frequently found in the D Learn forum with ready answers to questions on using the language. He also is an officer of the D Language Foundation.

Richard (Rikki) Andrew Cattermole

Richard is from New Zealand. He recently graduated from Christchurch Polytechnic Institute of Technology with a degree in ICT.
In the D community he is known for DOOGLE (GUI toolkit), Cmsed, Dvorm and more recently Devisualization.
He has a strange fascination with CTFE and its many uses. For this reason alone, Cmsed has significant CTFE usage in helping optimise routing and database work.

Craig Dillabaugh

Craig has been an avid fan of the D programming language since he first came across it seven years ago. It is the first open source project he has contributed to, but since he is not a programming whiz, like the others listed on this page, he has taken a management position. He has a Ph.D. in Computer Science from Carleton University with a specialization in Computational Geometry and External Memory data structures. He lives in Ottawa, Canada where he works as a developer for Solana Networks. In his free time he likes to hang out with his family, and is interested in soccer (football), ice hockey and skiing.

Johan Engelen

Johan is one of the core LDC developers. He is also a D compiler consultant for Weka.io (probably the largest industrial single-executable D codebase) and maintains and adds features to their fork of LDC. While working on LDC, he often studies the compiler output IR and assembly. To aid this, he helped set up LDC at d.godbolt.org. Examples of his recent contributions to LDC related to optimization are @fastmath, link-time optimization (LTO) and profile-guided optimization (PGO).

Johan (PhD Electrical Engineering) is assistant professor in the Robotics and Mechatronics (RAM) group at the University of Twente, the Netherlands. In his spare time he has contributed large amounts of work to the open source projects LDC and Inkscape.

Petar Kirov

Petar is an advocate of functional programming in D and regular contributor to - among others - Phobos and the DTour.

Sönke Ludwig

Sönke started in the D community as the author of Vibe.d and DUB - D's package manager. In his other free time, he also maintains Ddox - an advanced D documentation engine.

Jonathan Marler

Jonathan (https://github.com/marler8997) is a software developer at HP. In his free time he contributes to dmd and e.g. recently implemented the automatic compilation of imports.

Martin Nowak

After working 2 years as a C++ application and DSP developer at Ableton and almost 2 years of full-time open source work, Martin is currently working as a backend engineer at Mobisol. He is one of the main contributors to the D runtime and reference compiler, and also D's release manager.

Dmitry Olshansky

Dmitry Olshansky is an all-around researcher and software engineer. He's been a long-time D language contributor, with the most notable contributions being the std.regex and std.uni modules of the standard library. Aside from everything D related, his main interests are compilers, text processing, robotics, parallel and concurrent programming, scalable network systems and AI.

Stefan Rohe

Stefan is officially the first commercial D programmer worldwide. He is a fan of static code analysis and the author of AnalyzeD. At Funkwerk he introduced this programming language, so that D today is deployed widely in public transport all over Europe.

Steven Schveighoffer

Steven has been using D since 2007, is part of the core druntime and Phobos teams, and has written several D libraries. His contributions, aside from the iopipe library, include a container library (dcollections) and a rewrite of the array runtime; he is also the original proposer of the inout type modifier. He has been working on systems ranging from embedded controllers to high-end distributed systems since graduating from WPI. He currently works for National Resource Management in Massachusetts writing internal systems (some using D), and is the organizer for the Boston D language group.

Amaury Sechet

Amaury, or Deadalnix as he is known in the D community, is an engineer with Facebook, and the creator and lead developer on SDC.

Sebastian Wilzbach

Sebastian is a former GSoC student for the D Language Foundation (2016) and a regular contributor to Phobos. In his other life he studies computational biology in Munich and works as an IT consultant.

Ilya Yaroshenko

Ilya is an IT consultant with a background in statistics. He has experience in distributed high-load services and business process analyses. He is the creator of the Mir library.