Span in .NET, C# and other languages

By now, most .NET developers have probably heard about the new Span<T> types that will be added in the upcoming .NET and C# versions (C# 7.2, to be more precise).

I won’t go into details about what it is and why it’s one of the few new features that is not just syntactic sugar: the post by Stephen Toub explains it very well.
To summarize, it allows the developer to work with “ranges” or “slices” defined over array-like data (arrays, strings, but also unmanaged memory buffers) without copying the data in that range, without allocating new memory on the heap, and with access as fast as indexing an array.
Even if it is, strictly speaking, a framework feature (Span<T>, ReadOnlySpan<T>, etc.), it’s made possible by several new language-level features.
Why was this necessary in C#? Until now, doing something similar meant working with pointers and the unsafe keyword, or manually passing around a reference together with the start/end indexes.
Where is this feature really necessary? Mostly in code that must be highly optimized and that works with very large arrays (https://github.com/dotnet/corefxlab/blob/master/docs/specs/span.md).
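
The difference between passing a reference plus indexes and passing a slice can be sketched in Go, used here only because slices are built into the language. The function names sumRange and sum are made up for illustration; this is a minimal sketch, not how any particular library does it.

```go
package main

import "fmt"

// sumRange is the "manual" approach: the caller passes a reference to the
// whole array plus the start and end indexes of the range it cares about.
func sumRange(data *[10]int, start, end int) int {
	total := 0
	for i := start; i < end; i++ {
		total += data[i]
	}
	return total
}

// sum receives only the range itself; the bounds travel with the slice.
func sum(part []int) int {
	total := 0
	for _, v := range part {
		total += v
	}
	return total
}

func main() {
	arr := [10]int{0, 1, 2, 3, 4, 5, 6, 7, 8, 9}
	fmt.Println(sumRange(&arr, 5, 7)) // 11
	fmt.Println(sum(arr[5:7]))        // 11, and no elements are copied
}
```

With the slice version, the callee cannot accidentally read outside the range it was given, and the caller does not have to keep three values in sync.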

What I wanted to write about is something else: to understand such new features more easily, in any language or framework, it pays for a developer to learn about other languages and to look around at what others are doing.

Let’s look at a simple example in C#:

var arr = new byte[10];
for (byte i = 1; i < arr.Length; i++) arr[i] = i;

Console.WriteLine("\nOriginal array:"); // 0 1 2 3 4 5 6 7 8 9
for (int i = 0; i < arr.Length; i++) Console.Write($"{arr[i]} ");

var slice = new Span<byte>(arr, 5, 2); // 2 elements, starting at index 5
slice[0] = 42;
slice[1] = 43;
Console.WriteLine("\nSpan:");
for (int i = 0; i < slice.Length; i++) Console.Write($"{slice[i]} "); // 42 43

Console.WriteLine("\nOriginal array:"); // 0 1 2 3 4 42 43 7 8 9
for (int i = 0; i < arr.Length; i++) Console.Write($"{arr[i]} ");

and a similar piece of code in Go, where it’s called a slice:

var arr [10]int
for i := 0; i < 10; i++ {
  arr[i] = i
}
fmt.Printf("\nOriginal array: %v", arr) // 0 1 2 3 4 5 6 7 8 9

var slice = arr[5:7]
slice[0] = 42
slice[1] = 43
fmt.Printf("\nSlice: %v", slice) // 42 43

fmt.Printf("\nOriginal array: %v", arr) // 0 1 2 3 4 42 43 7 8 9

In both languages, a span (C#) and a slice (Go) represent a similar concept:
Span: “Span<T> is a value type containing a ref and a length”, “represent contiguous regions of arbitrary memory”
Slice: “A slice is a descriptor for a contiguous segment of an underlying array and provides access to a numbered sequence of elements from that array”
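
What “descriptor” means can be made visible in Go with a small sketch. The helper name describe is hypothetical; len and cap are Go built-ins.

```go
package main

import "fmt"

// describe returns the length and capacity carried by the slice descriptor.
func describe(s []int) (length, capacity int) {
	return len(s), cap(s)
}

func main() {
	arr := [10]int{0, 1, 2, 3, 4, 5, 6, 7, 8, 9}
	slice := arr[5:7]

	length, capacity := describe(slice)
	fmt.Println(length, capacity) // 2 5

	// The first element of the slice and arr[5] are the same memory location.
	fmt.Println(&slice[0] == &arr[5]) // true
}
```

Note that a Go slice also carries a capacity (how far it may grow within the underlying array), while the Span description quoted above mentions only a ref and a length.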

Are there differences? Probably: .NET spans can also be defined over strings (ReadOnlySpan<char>) or over a block of memory allocated on the stack (stackalloc). In .NET, a Span<T> itself can live only on the stack, not on the heap (there is Memory<T> for that), although the memory it points to may be on the heap, on the stack, or unmanaged.
Obviously, a Go slice is a built-in language feature, while a .NET Span<T> is a framework feature (that requires support from the language compiler).

Are there other languages that have the concept of slices? Of course: Fortran, Algol, D, Perl, Python, Ruby, etc.
The main difference could be: does a slice point back to the original array, or is it a copy?

As an example, in Ruby it seems to create a copy:

Interactive ruby ready.
> array = [:peanut, :butter, :and, :jelly]
=> [:peanut, :butter, :and, :jelly]
> slice = array[2,2]
=> [:and, :jelly]
> slice[0] = :lettuce
=> :lettuce
> slice
=> [:lettuce, :jelly]
> array
=> [:peanut, :butter, :and, :jelly]

while in the D language, as in C# and Go, it points to the original elements:

void main()
{
    import std.stdio : writefln;

    int[] arr = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9];
    writefln("Original array: %s\n", arr); // 0 1 2 3 4 5 6 7 8 9
    int[] slice = arr[5..7];
    slice[0] = 42;
    slice[1] = 43;
    writefln("Slice: %s\n", slice); // 42 43
    writefln("Original array: %s\n", arr); // 0, 1, 2, 3, 4, 42, 43, 7, 8, 9
}
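
The view-versus-copy distinction can also be shown inside a single language. The following Go sketch places the two behaviours side by side; viewOf and copyOf are hypothetical helper names, while make and copy are Go built-ins.

```go
package main

import "fmt"

// viewOf returns a slice that shares storage with data.
func viewOf(data []int, start, end int) []int {
	return data[start:end]
}

// copyOf returns a new slice holding a copy of the same range.
func copyOf(data []int, start, end int) []int {
	out := make([]int, end-start)
	copy(out, data[start:end])
	return out
}

func main() {
	data := []int{0, 1, 2, 3, 4, 5, 6, 7, 8, 9}

	c := copyOf(data, 5, 7)
	c[0] = 42
	fmt.Println(data) // [0 1 2 3 4 5 6 7 8 9], the copy is independent

	v := viewOf(data, 5, 7)
	v[0] = 42
	fmt.Println(data) // [0 1 2 3 4 42 6 7 8 9], the view writes through
}
```

The copy behaves like the Ruby example above, and the view behaves like the C#, Go and D examples.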

The conclusion would be: anybody who makes the effort to learn a bit of the fundamentals, and to look beyond their own backyard, will find it much easier to grasp new concepts, or even to switch to a different platform.

An excellent (but somewhat dry) book that took this approach was (in Romanian) ‘Fundamentele limbajelor de programare’ by Bazil Pârv and Alexandru Vancea. Back then (1992) it was more of an academic book, not something you could use to learn practical programming, but the fundamentals were well illustrated.
