Mujib Ishola

Code Generator

Code generators have been around for a very long time in software and are everywhere. From scaffolding generation to REST API generation, there are many useful code generation tools available.

When people think of code generation, they often think of code that is produced automatically, and therefore quickly, and whose quality is already implicitly assured.

In this article we explore the possibilities offered by code generation to give you an overview of the subject.

What is a Code Generator?

A code generator is a tool that produces source code, usually in a particular programming language, from some higher-level input.

Code generation can be used for small portions of code or for entire applications. It can target widely different programming languages and rely on different techniques.

There are many ways code can be generated:

  • We can generate repetitive code from schemas or other sources of information we have, e.g., Data Access Objects from database schema files
  • We can generate code from wizards
  • We can generate skeletons of applications from simple models
  • We can generate entire applications from a high-level DSL or IDL
  • We can generate code from information we obtained by processing existing documents
  • We can generate code from information we obtained by reverse-engineering code written using other languages or frameworks
  • We can generate code using programming languages with powerful metaprogramming features
  • Some IDEs have the functionality to generate boilerplate code, like the equals or hashCode methods in Java
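To make the first item in the list concrete, here is a minimal sketch in Python of generating a Data Access Object from a schema. The schema dictionary, the `generate_dao` function and the shape of the generated class are all assumptions made for illustration; real generators usually read the schema from a file or from the database itself.

```python
import sqlite3

# Hypothetical schema description: table name -> list of column names.
SCHEMA = {"users": ["id", "name", "email"]}


def generate_dao(table: str, columns: list[str]) -> str:
    """Return Python source for a small DAO class for one table."""
    class_name = table.capitalize() + "Dao"
    column_list = ", ".join(columns)
    placeholders = ", ".join("?" for _ in columns)
    lines = [
        f"class {class_name}:",
        "    def __init__(self, connection):",
        "        self.connection = connection",
        "",
        "    def insert(self, row):",
        f"        sql = 'INSERT INTO {table} ({column_list}) VALUES ({placeholders})'",
        "        self.connection.execute(sql, row)",
        "",
        "    def find_all(self):",
        f"        return self.connection.execute('SELECT {column_list} FROM {table}').fetchall()",
    ]
    return "\n".join(lines) + "\n"


dao_source = generate_dao("users", SCHEMA["users"])

# Usage sketch: load the generated class and try it on an in-memory database.
namespace = {}
exec(dao_source, namespace)
connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE users (id INTEGER, name TEXT, email TEXT)")
dao = namespace["UsersDao"](connection)
dao.insert((1, "Ada", "ada@example.com"))
rows = dao.find_all()
```

In practice the generated source would be written to a file and committed or rebuilt, rather than loaded with `exec`; that shortcut is used here only to show that the output is valid code.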

Why Use Code Generation

There are fundamentally four reasons to use code generation: consistency, productivity, portability, and simplification.

Consistency

With code generation you always get the code you expect. The generated code is designed according to the same principles, the naming rules match, and so on. The code always works the way you expect, except of course when there are bugs in the generator. The quality of the code is consistent. With code written by hand, in contrast, different developers may use different styles and occasionally introduce errors even in the most repetitive code.

Productivity

With code generation you write the generator once and reuse it as many times as you need. Providing the specific inputs to the generator and invoking it is significantly faster than writing the code manually, so code generation saves time.

Portability

Once you have a process to generate code for a certain language or framework you can simply change the generator and target a different language or framework. You can also target multiple platforms at once. For example, with a parser generator you can get a parser in C#, Java and C++. Another example: you might write a UML diagram and use code generation to create both a skeleton class in C# and the SQL code to create a database for MySQL. So the same abstract description can be used to generate different kinds of artifacts.
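The idea that one abstract description can feed several targets can be sketched as follows. The `MODEL` dictionary stands in for the UML diagram and is purely hypothetical, as are the two type-mapping tables and the function names; the targets here are a Python class skeleton and a generic SQL table definition rather than the C# and MySQL output mentioned above.

```python
# Hypothetical abstract model: entity name plus (field, abstract type) pairs.
MODEL = {"name": "Customer", "fields": [("id", "int"), ("email", "string")]}

# Each target maps the same abstract types to its own vocabulary.
PYTHON_TYPES = {"int": "int", "string": "str"}
SQL_TYPES = {"int": "INTEGER", "string": "VARCHAR(255)"}


def to_python_class(model: dict) -> str:
    """Generate a class skeleton from the abstract model."""
    params = ", ".join(f"{field}: {PYTHON_TYPES[kind]}" for field, kind in model["fields"])
    lines = [f"class {model['name']}:", f"    def __init__(self, {params}):"]
    lines += [f"        self.{field} = {field}" for field, _ in model["fields"]]
    return "\n".join(lines) + "\n"


def to_sql_table(model: dict) -> str:
    """Generate a CREATE TABLE statement from the same abstract model."""
    columns = ",\n".join(f"    {field} {SQL_TYPES[kind]}" for field, kind in model["fields"])
    return f"CREATE TABLE {model['name'].lower()} (\n{columns}\n);\n"


class_source = to_python_class(MODEL)
table_source = to_sql_table(MODEL)
```

Adding another target under this design means adding one type table and one function, while the model itself stays untouched.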

Simplification

With code generation you generate your code from some abstract description. This means that your source of truth becomes that description, not the code. The description is typically easier to analyze and check than the whole of the generated code.

Why Not to Use Code Generation

Like all tools, code generation is not perfect. It has two main issues: maintenance and complexity.

Maintenance

When you use a code generator, your code becomes dependent on it, and a code generator must be maintained. If you created it, you have to keep updating it; if you are using an existing one, you have to hope that somebody keeps maintaining it, or take over maintenance yourself. So the advantages of code generation are not free. This is especially risky if you do not have, or cannot find, people with the right skills to work on the generator.

Complexity

Automatically generated code tends to be more complex than code written by hand. Sometimes this is due to glue code needed to link different parts together, and sometimes to the fact that the generator supports more use cases than the ones you need. In the second case the generated code can do more than you want, which is not necessarily an advantage. Generated code is also often less optimized than code you could write by hand. Sometimes the difference is small and not significant, but if your application needs to squeeze out every bit of performance, code generation might not be the best choice.

How Can We Use Code Generation?

Depending on the context, code generation can be just a useful productivity boost or a vital component of your development process. One useful example is found in many modern IDEs: they let you create a skeleton class that implements an interface, or something similar, with the click of a button. You could certainly write that code yourself, but you would waste time on a menial task.

There are many possible ways to design a code generation pipeline. Basically we need to define two elements:

  1. Input: where the information to be used in code generation comes from
  2. Output: how the generated code will be obtained

Optionally, you may have transformation steps in between the input and the output. They could be useful to simplify the output layer and to make the input and the output more independent.
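The three stages can be sketched as three small functions. Everything here is an assumption made for illustration: the CSV content, the function names, and the choice of generating a Python module of constants.

```python
import csv
import io

# Input: a hypothetical CSV listing configuration constants.
CSV_TEXT = "name,value\nmax_retries,3\ntimeout_seconds,30\n"


def read_input(text: str) -> list[dict]:
    """Input stage: parse raw rows from the data source."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows: list[dict]) -> list[tuple[str, int]]:
    """Transformation stage: normalize rows into (CONSTANT_NAME, int) pairs."""
    return [(row["name"].upper(), int(row["value"])) for row in rows]


def render(constants: list[tuple[str, int]]) -> str:
    """Output stage: emit Python source, knowing nothing about CSV."""
    header = "# Generated file, do not edit by hand.\n"
    return header + "".join(f"{name} = {value}\n" for name, value in constants)


generated = render(transform(read_input(CSV_TEXT)))
```

Because `render` only sees the normalized pairs, the CSV input could be swapped for a spreadsheet or a database query without touching the output layer, which is the independence the transformation step is meant to provide.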

Possible inputs:

  • A DSL/IDL: for example, we can use ANTLR to describe the grammar of a language, and from that grammar generate a parser
  • Code in other formats, such as database schemas: from a database schema we can generate DAOs
  • Wizards: they let us ask the user for information
  • Reverse engineering: the information can be obtained by processing complex code artifacts
  • Data sources like a DB, a CSV file or a spreadsheet

Possible outputs:

  • Template engines: most web programmers know template engines, which are used to fill an HTML UI with data
  • Code-building APIs: e.g., JavaParser can be used to create Java files programmatically
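The difference between the two output styles can be sketched with Python's standard library, assuming Python 3.9 or later for `ast.unparse`. The first variant fills a text template; the second manipulates a syntax tree through an API and prints it back as code. The `ast` module is used here only as a stand-in for a code-building API such as JavaParser, and the getter being generated is a made-up example.

```python
import ast
from string import Template

# Output option 1: a text template with placeholders filled in from data.
GETTER_TEMPLATE = Template("def get_$name(self):\n    return self._$name\n")
from_template = GETTER_TEMPLATE.substitute(name="email")


# Output option 2: work on a syntax tree through an API, then print it as code.
def build_getter(name: str) -> str:
    # Start from a generic getter tree and rename its parts via the tree API.
    tree = ast.parse("def get_x(self):\n    return self._x\n")
    function = tree.body[0]
    function.name = f"get_{name}"
    function.body[0].value.attr = f"_{name}"
    return ast.unparse(tree) + "\n"


from_api = build_getter("email")
```

Templates are quick to write and easy to read, while a tree-based API makes it harder to emit syntactically invalid code, at the cost of more verbose generator logic.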

Let’s now examine some pipelines:

  • Parser generation: many readers will be familiar with ANTLR and similar tools that automatically generate parsers from a formal grammar. In this case the input is a DSL and the output is produced using a template engine
  • Model-driven design: plugins for IDEs, or standalone IDEs, that let you describe a model of an application, sometimes with a graphical interface, and generate an entire application or just its skeleton from it
  • Database-related code: this use can be considered the child of model-driven design and template engines. Usually the programmer defines a database schema from which entire CRUD apps, or just the code to handle the database, can be generated. There are also tools that perform the reverse process: from an existing database they create a database schema or the code to handle it
  • Metaprogramming languages: this group includes languages that allow near-complete manipulation of the code of a program, where the source code is just another data structure that can be manipulated
  • Ad hoc applications: this category includes everything from tools designed to handle one thing to ad hoc systems, used in an enterprise setting, that can generate entire applications from a formal custom description. These applications are generally part of a specific workflow. For example, the customer uses a graphical interface to describe an app, then one ad hoc system generates the database schema to support this app, another one generates a CRUD interface, and so on
  • IDE-generated code: many statically typed languages require a lot of boilerplate code, and an IDE can typically generate some of it: classes with stubs for the methods to implement, standard equals, hashCode and toString methods, and getters and setters for all existing properties
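The last two pipelines can be combined in a small sketch: instead of an IDE writing equality, hashing and string methods into the source file, a language with metaprogramming features can attach them at runtime. The `value_object` decorator below is a hypothetical helper written for this illustration; Python's standard `dataclasses` module offers a production-quality version of the same idea.

```python
def value_object(*fields):
    """Class decorator that generates __init__, __eq__, __hash__ and __repr__."""

    def decorate(cls):
        def __init__(self, *values):
            if len(values) != len(fields):
                raise TypeError(f"expected {len(fields)} values, got {len(values)}")
            for field, value in zip(fields, values):
                setattr(self, field, value)

        def key(self):
            # Tuple of field values, shared by equality and hashing.
            return tuple(getattr(self, field) for field in fields)

        def __eq__(self, other):
            return type(other) is type(self) and key(self) == key(other)

        def __hash__(self):
            return hash(key(self))

        def __repr__(self):
            args = ", ".join(f"{field}={getattr(self, field)!r}" for field in fields)
            return f"{cls.__name__}({args})"

        # Attach the generated methods to the class being decorated.
        cls.__init__, cls.__eq__ = __init__, __eq__
        cls.__hash__, cls.__repr__ = __hash__, __repr__
        return cls

    return decorate


@value_object("x", "y")
class Point:
    pass
```

With this in place, `Point(1, 2) == Point(1, 2)` holds and both instances hash to the same value, without any of that boilerplate appearing in the class body.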

Code Generation Tools

Having covered code generators in general, let us explore some code generation tools.

  • Yeoman

Yeoman is a generic scaffolding system that can create any kind of app, so that new projects start out following established best practices.

The core of Yeoman is a generator ecosystem, on top of which developers build their own templates. The tool is so popular that there are already thousands of templates available.

Yeoman is a JavaScript app, so writing a generator simply requires writing JavaScript code that uses the provided API. The workflow is quite simple, too: you ask the user for information about the project (e.g., its name), gather configuration information, and then generate the project.
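The workflow described above (collect answers, then render project files) can be sketched independently of Yeoman. The code below is Python and does not use or imitate the Yeoman API; the `TEMPLATES` mapping, the `scaffold` function and the answer keys are all hypothetical.

```python
import tempfile
from pathlib import Path
from string import Template

# Hypothetical project templates: relative path -> content with placeholders.
TEMPLATES = {
    "README.md": "# $name\n\n$description\n",
    "src/main.py": 'print("Hello from $name")\n',
}


def scaffold(target: Path, answers: dict) -> list[Path]:
    """Render every template with the user's answers and write it under target."""
    written = []
    for relative_path, content in TEMPLATES.items():
        destination = target / relative_path
        destination.parent.mkdir(parents=True, exist_ok=True)
        destination.write_text(Template(content).substitute(answers))
        written.append(destination)
    return written


# Usage sketch: in a real tool the answers would come from interactive prompts.
with tempfile.TemporaryDirectory() as tmp:
    answers = {"name": "demo", "description": "A demo app."}
    written = scaffold(Path(tmp), answers)
    readme = (Path(tmp) / "README.md").read_text()
    main_py = (Path(tmp) / "src" / "main.py").read_text()
```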

  • Bower

Bower offers a generic, unopinionated solution to the problem of front-end package management, while exposing the package dependency model via an API that can be consumed by a more opinionated build stack. There are no system wide dependencies, no dependencies are shared between different apps, and the dependency tree is flat.

Bower runs over Git, and is package-agnostic. A packaged component can be made up of any type of asset, and use any type of transport (e.g., AMD, CommonJS, etc.).

  • Webpack

Webpack is an open-source JavaScript module bundler. It is made primarily for JavaScript, but it can transform front-end assets such as HTML, CSS, and images if the corresponding loaders are included. Webpack takes modules with dependencies and generates static assets representing those modules.

Webpack builds a dependency graph from these dependencies, which lets web developers take a modular approach to application development. It can be used from the command line or configured through a file named webpack.config.js, which defines rules, plugins, and other settings for a project. Webpack is highly extensible via loaders, which are configured through rules and let developers write custom tasks to perform when bundling files together.

Node.js is a prerequisite for using Webpack. Webpack can also load code on demand, a feature known as code splitting. This builds on dynamic import(), a function-like syntax for loading additional code that Technical Committee 39 for ECMAScript developed as “proposal-dynamic-import” and that is now part of the ECMAScript standard.

  • SlushJS

Slush is a scaffolding tool, that is, a tool that helps you generate new project structures so that a new project is up and running in a matter of seconds. It belongs to the Front End Scaffolding Tools category of a tech stack.

SlushJS is an open source tool with 1.2K GitHub stars and 59 GitHub forks. Its source repository is hosted on GitHub.

  • JHipster

JHipster is a free and open-source application generator used to quickly develop modern web applications and microservices using Angular or React and the Spring Framework.

JHipster provides tools to generate a project with a Java stack on the server side (using Spring Boot) and a responsive web front-end on the client side (with Angular and Bootstrap). It can also create a microservice stack with support for Netflix OSS, Docker and Kubernetes.

  • Celerio

Celerio is a Java tool that includes a database extractor, which obtains a database schema from an existing database. It then couples this extracted schema with configuration files and launches a template engine to create a whole application. The extracted database schema is in XML format.
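The extraction step can be illustrated with a small Python sketch that reads the metadata of a live SQLite database and writes it out as XML. This is not Celerio's extractor and the XML layout is invented for the example; it only shows the general technique of turning an existing database into a schema document that a template engine could consume.

```python
import sqlite3
import xml.etree.ElementTree as ET


def extract_schema(connection: sqlite3.Connection) -> ET.Element:
    """Read table and column metadata from a live database into an XML tree."""
    root = ET.Element("schema")
    tables = connection.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    for (table_name,) in tables:
        table = ET.SubElement(root, "table", name=table_name)
        # PRAGMA table_info rows are (cid, name, type, notnull, dflt_value, pk).
        # The table name comes from the database itself, not from user input.
        for row in connection.execute(f"PRAGMA table_info({table_name})"):
            ET.SubElement(table, "column", name=row[1], type=row[2])
    return root


# Usage sketch with an in-memory database and a made-up table.
connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE book (id INTEGER PRIMARY KEY, title TEXT)")
schema_xml = ET.tostring(extract_schema(connection), encoding="unicode")
```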

  • Umple

Umple is an example of a tool that combines UML models with traditional programming languages in a structured way. It was created to simplify model-driven development, which traditionally requires specific and complex tools. It is essentially a programming language that supports features of UML class and state diagrams to define models. The Umple compiler then transforms Umple code into a traditional language like Java or PHP.

Umple can be used in more than one way:

  • It can be used to describe UML diagrams in a textual manner
  • It can be used, in combination with a traditional language, as a preprocessor or an extension for that target language. The Umple compiler transforms the Umple code into the target language and leaves the existing target-language code unchanged
  • Given its extensive support for UML state machines, it can work as a state machine generator: from an Umple description of a state machine, an implementation can be generated in many target languages
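The state machine use case can be sketched in a few lines. The description format below is a plain Python dictionary, not Umple syntax, and `generate_state_machine` is a hypothetical generator written for this example; it emits a class with one method per event.

```python
# Hypothetical state machine description: state -> {event: next state}.
TRANSITIONS = {
    "closed": {"open": "opened"},
    "opened": {"close": "closed"},
}


def generate_state_machine(class_name: str, initial: str, transitions: dict) -> str:
    """Return Python source for a class with one method per event."""
    events = sorted({event for moves in transitions.values() for event in moves})
    lines = [
        f"class {class_name}:",
        "    def __init__(self):",
        f"        self.state = {initial!r}",
    ]
    for event in events:
        # States in which this event is allowed, mapped to where it leads.
        table = {s: moves[event] for s, moves in transitions.items() if event in moves}
        lines += [
            "",
            f"    def {event}(self):",
            f"        moves = {table!r}",
            "        if self.state not in moves:",
            f"            raise ValueError('cannot {event} while ' + self.state)",
            "        self.state = moves[self.state]",
        ]
    return "\n".join(lines) + "\n"


door_source = generate_state_machine("Door", "closed", TRANSITIONS)
```

The same description could be fed to a second generator that emits another target language, which is the property the list above attributes to Umple.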

Conclusions

In this article we have had a glimpse of the vast world of code generation. We have seen the main categories of generators and a few examples of specific generators. Some of these categories will be familiar to most developers; some will be new to most. All of them can be used to increase productivity, provided that you use them right.

There is much more to say about each specific category. However, we wanted to write an introductory article that presents this world without overwhelming the reader with information.

How do you plan to take advantage of code generation?
