C# RoslynAPI SyntaxTree - Variables Are Not Variables (Gotchas)

Introduction

Have you ever wondered what a variable actually is, in terms of syntax, when you write C# code? Are you familiar with the Roslyn API? If you are planning to create a developer tool for other C# developers, write a Visual Studio extension, or build a .NET analyzer, then it's very likely you'll use the Roslyn API. The goal of this article isn't to cover all the basic concepts but to gently introduce you to the world of the latest C# compiler platform.

Below are some questions that can help to verify your assumptions. If you know the answers, feel free to skip the article:

  1. Do you know that an event in C# is actually both a field and a variable from the syntax point of view?
  2. Do you know that to get a method's local variables, you should NOT search for VariableDeclarationSyntax?
  3. Do you know the difference between VariableDeclarationSyntax and VariableDeclaratorSyntax?
  4. Do methods accept variables from the syntax point of view?

Prerequisites and tools


To demonstrate the concept of C# variables from the syntax point of view, I'll be using the "Roslyn Syntax Visualizer" tool for Visual Studio. You can find more about it here. It's a tool to browse the syntax tree of the active file in a project.

To get started with understanding SyntaxTrees, semantics, and the difference between both, I recommend reading the official documentation:

  1. Work with syntax | Microsoft Docs
  2. Work with semantics | Microsoft Docs

A syntax tree is a tree structure that the compiler builds while parsing your code. It represents the code as SyntaxNode objects of different kinds: classes, structs, variable declarations, and so on. The root node of a C# syntax tree is a CompilationUnitSyntax, which usually corresponds to one source file, so a project consists of multiple syntax trees, one per file. That isn't always the case, though, because the Roslyn API can parse source text that never existed as a file. Anyway, this should be enough to get to the point.
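To make the "no files needed" point concrete, here is a minimal sketch that parses a string and inspects the root node. It assumes a console project targeting a recent .NET SDK (for top-level statements) with the Microsoft.CodeAnalysis.CSharp NuGet package referenced; the `Person` class in the source string is only an illustration.

```csharp
using System;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp;
using Microsoft.CodeAnalysis.CSharp.Syntax;

// Parse source text directly; no file on disk is involved.
SyntaxTree tree = CSharpSyntaxTree.ParseText(
    "class Person { private int age, height; }");

// The root of a C# syntax tree is a CompilationUnitSyntax.
CompilationUnitSyntax root = tree.GetCompilationUnitRoot();

Console.WriteLine(root.Kind());        // CompilationUnit
Console.WriteLine(root.Members.Count); // 1 (the Person class)
```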

Variables are Not Variables

Exploring the fields

To start, let's consider the following class (you can skim it for now and return to it as you need):

Here we can see two private fields, defined in a single line in a class:

private int age, height;

Can you guess what kind of a syntax node it represents? Something related to variables, right? Not exactly.

A field in the C# syntax world is represented by the FieldDeclarationSyntax node. Below is the output of the "Roslyn Syntax Visualizer" for this line of code:

You can see that it's not a variable. It does, however, have a child VariableDeclarationSyntax node, which in turn contains one VariableDeclaratorSyntax node per field. But does the field node itself contain any names? Let's take a look at the properties of this node:

It doesn't seem so. Does the variable declaration contain it?

Another miss. Does VariableDeclaratorSyntax contain it? Almost. It exposes an Identifier, which is a token (a SyntaxToken of kind IdentifierToken) rather than a node, and that token holds the actual name of the first field:

and another one of the second field:
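The same walk can be done in code. The following is a minimal sketch, assuming the Microsoft.CodeAnalysis.CSharp NuGet package is referenced; the surrounding `Person` class is hypothetical, only the field line comes from the article.

```csharp
using System;
using System.Linq;
using Microsoft.CodeAnalysis.CSharp;
using Microsoft.CodeAnalysis.CSharp.Syntax;

var root = CSharpSyntaxTree
    .ParseText("class Person { private int age, height; }")
    .GetCompilationUnitRoot();

// One FieldDeclarationSyntax covers the whole line, including both names.
FieldDeclarationSyntax field = root.DescendantNodes()
    .OfType<FieldDeclarationSyntax>()
    .Single();

// The declaration holds the shared type, but no names.
VariableDeclarationSyntax declaration = field.Declaration;
Console.WriteLine(declaration.Type); // int

// Each declarator holds exactly one name, as an identifier token.
foreach (VariableDeclaratorSyntax declarator in declaration.Variables)
{
    Console.WriteLine(declarator.Identifier.ValueText); // age, then height
}
```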

Exploring the events

What are the events from the syntax point of view? Are those variables?

public event EventHandler<EventArgs> HappinessMaxout, SaddnessMaxout;

Let's take a look at the syntax tree. Here is EventFieldDeclarationSyntax:

Notice that it also contains a VariableDeclarationSyntax node, which in turn contains multiple VariableDeclaratorSyntax nodes.

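So, to answer question #1 at the top of the article: yes, from the syntax point of view, events declared this way (field-like events) are both fields and variables. Events declared with explicit add/remove accessors are represented by a different node, EventDeclarationSyntax, which has no VariableDeclarationSyntax.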
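In the Roslyn API, FieldDeclarationSyntax and EventFieldDeclarationSyntax share the base class BaseFieldDeclarationSyntax, so both can be handled in one pass. A minimal sketch, assuming the Microsoft.CodeAnalysis.CSharp NuGet package is referenced and using a hypothetical `Person` class around the two lines from the article:

```csharp
using System;
using System.Linq;
using Microsoft.CodeAnalysis.CSharp;
using Microsoft.CodeAnalysis.CSharp.Syntax;

var root = CSharpSyntaxTree.ParseText(@"
using System;
class Person
{
    private int age, height;
    public event EventHandler<EventArgs> HappinessMaxout, SaddnessMaxout;
}").GetCompilationUnitRoot();

// Both fields and field-like events derive from BaseFieldDeclarationSyntax,
// and both expose their names through Declaration.Variables.
foreach (var member in root.DescendantNodes().OfType<BaseFieldDeclarationSyntax>())
{
    var names = member.Declaration.Variables.Select(v => v.Identifier.ValueText);
    Console.WriteLine($"{member.Kind()}: {string.Join(", ", names)}");
}
// FieldDeclaration: age, height
// EventFieldDeclaration: HappinessMaxout, SaddnessMaxout
```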

Exploring local variables

What are the local variables from the syntax point of view? Let's take a look at this line of code:

int mutation1, mutation2;

It also contains a VariableDeclarationSyntax, but the node itself is a LocalDeclarationStatementSyntax:
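This is why searching for VariableDeclarationSyntax is the wrong way to find locals: it matches fields too. A minimal sketch of the difference, assuming the Microsoft.CodeAnalysis.CSharp NuGet package is referenced; the `Person` class and `Mutate` method are hypothetical. Note that it only covers plain local declaration statements, not every construct that can introduce a local.

```csharp
using System;
using System.Linq;
using Microsoft.CodeAnalysis.CSharp;
using Microsoft.CodeAnalysis.CSharp.Syntax;

var root = CSharpSyntaxTree.ParseText(@"
class Person
{
    private int age, height;
    void Mutate() { int mutation1, mutation2; }
}").GetCompilationUnitRoot();

// Matches the field declaration AND the local declaration.
int allDeclarations = root.DescendantNodes()
    .OfType<VariableDeclarationSyntax>()
    .Count();
Console.WriteLine(allDeclarations); // 2

// Matches only the local declaration statement inside the method.
var localNames = root.DescendantNodes()
    .OfType<LocalDeclarationStatementSyntax>()
    .SelectMany(statement => statement.Declaration.Variables)
    .Select(declarator => declarator.Identifier.ValueText);
Console.WriteLine(string.Join(", ", localNames)); // mutation1, mutation2
```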

Fields, Events, Methods Exploration Summary

What does all this mean? First of all, in terms of syntax, VariableDeclarationSyntax on its own is ambiguous, because it can be part of:

  1. Fields
  2. Events
  3. Local variables (inside methods and other code blocks)

Secondly, fields, events, and local variables are all variables in terms of syntax. To be more precise, each of them contains a VariableDeclarationSyntax with one or more VariableDeclaratorSyntax nodes.

What is VariableDeclaratorSyntax and why do we need it?

By now you should already have a feel for the difference between VariableDeclarationSyntax and VariableDeclaratorSyntax.

VariableDeclarationSyntax allows defining multiple variables on a single line:

int a, b;

or on multiple lines:

int a,
    b;

while each variable inside it is represented by VariableDeclaratorSyntax.

When we want to access the first variable's name, we can't do it with VariableDeclarationSyntax alone, because it has no identifier of its own (the identifier token is what holds the actual variable name).

But VariableDeclaratorSyntax does have an Identifier token with the name. Here is code to demonstrate the concept:

variableDeclarationNode.Variables[0].Identifier.ValueText == "a"; // true
variableDeclarationNode.Variables[1].Identifier.ValueText == "b"; // true

Where:

  • Variables[0] = VariableDeclaratorSyntax
  • Identifier = SyntaxToken (an identifier token)
  • ValueText = the actual name of a variable
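For a version of the snippet above that runs as-is, the sketch below builds `variableDeclarationNode` from a single parsed statement. It assumes the Microsoft.CodeAnalysis.CSharp NuGet package is referenced; the `@class` example is my own addition to show how `Text` and `ValueText` differ on an identifier token.

```csharp
using System;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp;
using Microsoft.CodeAnalysis.CSharp.Syntax;

// "int a, b;" parses as a local declaration statement.
var statement = (LocalDeclarationStatementSyntax)SyntaxFactory.ParseStatement("int a, b;");
VariableDeclarationSyntax variableDeclarationNode = statement.Declaration;

Console.WriteLine(variableDeclarationNode.Variables[0].Identifier.ValueText == "a"); // True
Console.WriteLine(variableDeclarationNode.Variables[1].Identifier.ValueText == "b"); // True

// Text is the identifier as written; ValueText is the name it denotes.
var escaped = (LocalDeclarationStatementSyntax)SyntaxFactory.ParseStatement("int @class;");
SyntaxToken identifier = escaped.Declaration.Variables[0].Identifier;
Console.WriteLine(identifier.Text);      // @class
Console.WriteLine(identifier.ValueText); // class
```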

Hopefully, it answers questions #2 and #3 at the top of the article.

ParameterSyntax vs ArgumentSyntax


We often use the words "parameter" and "argument" as synonyms. We can also say that we pass a variable to a method. But in RoslynAPI, those are different.

ParameterSyntax is used inside MethodDeclarationSyntax:

ArgumentSyntax is used inside InvocationExpressionSyntax (when calling methods):

Neither of them contains a VariableDeclarationSyntax or a VariableDeclaratorSyntax.

So, to answer question #4: in everyday speech we can call parameters and arguments "variables", but in terms of C# syntax we can't say that a method accepts variables.
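The distinction can be checked with a small sketch like the one below. It assumes the Microsoft.CodeAnalysis.CSharp NuGet package is referenced; the `Grow` and `Run` methods are hypothetical and exist only to provide one parameter and one argument.

```csharp
using System;
using System.Linq;
using Microsoft.CodeAnalysis.CSharp;
using Microsoft.CodeAnalysis.CSharp.Syntax;

var root = CSharpSyntaxTree.ParseText(@"
class Person
{
    void Grow(int delta) { }
    void Run() { int step = 1; Grow(step); }
}").GetCompilationUnitRoot();

// The declaration side: a ParameterSyntax with its own type and identifier token.
ParameterSyntax parameter = root.DescendantNodes().OfType<ParameterSyntax>().Single();
Console.WriteLine($"parameter: {parameter.Type} {parameter.Identifier.ValueText}");
// parameter: int delta

// The call side: an ArgumentSyntax wrapping an arbitrary expression.
ArgumentSyntax argument = root.DescendantNodes().OfType<ArgumentSyntax>().Single();
Console.WriteLine($"argument: {argument.Expression} ({argument.Expression.Kind()})");
// argument: step (IdentifierName)

// Neither node contains a variable declaration.
Console.WriteLine(parameter.DescendantNodes().OfType<VariableDeclarationSyntax>().Any()); // False
Console.WriteLine(argument.DescendantNodes().OfType<VariableDeclarationSyntax>().Any());  // False
```

Here the argument happens to be a plain name, but it could just as well be a literal or a method call, which is another reason not to equate arguments with variables.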

Summary

And that's it! To learn those concepts further along with other Roslyn APIs, I recommend writing a simple console application and exploring it with the "Roslyn Syntax Visualizer" extension in Visual Studio.

Also, to find a real-world example, you can check out my dngrep repo (.NET Global Tool) on GitHub. Thanks for reading!