pointers in Monkey C - bug or by design?

this code

function f(arr)
{
    arr[0]=11;
    arr[1]=12;
    arr[2]=13;
}

function test()
{
    var x = [1,2,3];
    SYS.println(x);
    f(x);
    SYS.println(x);
}

produces in console:
[1, 2, 3]
[11, 12, 13]

It means that f modified the external variable x. Is this by design, or is it a bug?

sdk 4.1b, 4.0.6, 4.0.5...
eclipse CIQ plug in: 4.1.0.beta1
eclipse ver: 2021-09 (4.21.0) Build id: 20210910-1417
windows 10
minSdkVersion 2.4.0
java jre1.8.0_311

Top Replies

flowstate over 3 years ago +1

Actually, f() didn't modify x. It modified the contents of what x points to. If you had code in f() such as "arr = [4, 5, 6]", then the contents of x in test() would be unchanged.

This is…
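A rough analogue of that distinction, sketched in Python rather than Monkey C (Python lists behave similarly here and the snippet is easy to run). The function names mutate and rebind are made up for this sketch; this is not a claim about how the Monkey C runtime is implemented.

```python
def mutate(arr):
    # Changes the contents of the list object the caller passed in.
    arr[0] = 11
    arr[1] = 12
    arr[2] = 13


def rebind(arr):
    # Rebinds only the local name arr; the caller's list is untouched.
    arr = [4, 5, 6]


x = [1, 2, 3]

mutate(x)
after_mutate = list(x)  # snapshot of x after the in-place change

rebind(x)
after_rebind = list(x)  # snapshot of x after the rebinding attempt

print(after_mutate, after_rebind)
```

Both snapshots hold the mutated contents: the first call changed what x refers to, and the second call changed nothing the caller can see.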
_psx_ over 3 years ago in reply to flowstate +1

ok, this is the nicest "bug" I have ever seen :)

it resolves many troubles
dpawlyk over 3 years ago in reply to flowstate +1

flowstate said:
- In the context of the same compilation unit, x is an array (e.g. sizeof(x) is the size of the array)

- In the context of a different compilation unit, x is a pointer to the first element…

All Replies

0 flowstate over 3 years ago in reply to dpawlyk

dpawlyk said:
The MS link shows that arrays are passed as references in true functions and as reference in lambda functions (I think that's what they are called), the thing with the "=>".

Yes, they are "passed as references" (reference types), not "passed by reference" (which means that the function can modify the parameter itself).

Sorry to post this again:

https://docs.microsoft.com/en-us/dotnet/csharp/language-reference/keywords/ref

Don't confuse the concept of passing by reference with the concept of reference types. The two concepts are not the same. A method parameter can be modified by ref regardless of whether it is a value type or a reference type. There is no boxing of a value type when it is passed by reference.

You may not agree with this distinction but other discussions make the same kind of distinction.

0 flowstate over 3 years ago in reply to flowstate

Maybe a different, simpler way of phrasing it:

1) "pass as value/reference type": refers to what is being passed in (did we pass in a primitive value, pointer, reference, or entire object?)

2) "pass by value/reference": refers to how it's being passed in (did we pass in the parameter in a way that the parameter can be modified by the function, or not?)

Of course 2) is still ambiguous depending on context.

0 dpawlyk over 3 years ago in reply to flowstate

flowstate said:
Yes, they are "passed as references" (reference types), not "passed by reference" (which means that the function can modify the parameter itself).

Yeah, it's doing the same thing as C does except you can't change the value of the address (basically, it's a read-only value with the address of the first element). C# doesn't use pointers.

The MS link actually shows that both functions change the values of the elements in the array.

0 dpawlyk over 3 years ago in reply to flowstate

flowstate said:
Maybe a different, simpler way of phrasing it:

By value => a copy is created. Whatever you do with the copy doesn't change the original.

By/as reference => another name (alias) to one thing.

C doesn't really have "references" (it's a bit of a lie). What are called references in C are just memory address values (that can be used to refer to things stored at the address). In other words, in C, you pass a value that is used to reference (read another value from) memory.

0 flowstate over 3 years ago in reply to dpawlyk

dpawlyk said:
C doesn't really have "references" (it's a bit of a lie). What are called references in C are just memory address values (that can be used to refer to things stored at the address). In other words, in C, you pass a value that is used to reference (read another value from) memory.

I assume you mean C++, since I don't think anyone claims that C has references.

I'll just quote someone who wrote something a lot more precise than I care to:

https://stackoverflow.com/questions/17168623/does-c-even-have-pass-by-reference

The concept of "reference semantics" is an abstract, theoretical one, meaning that you have a way for a function to modify existing data in place by giving that function a reference to the data.

The question is whether reference semantics can be implemented by a language. In C, you can obtain reference semantics by using pointers. That means you can obtain the desired behaviour, but you need to assemble it yourself from various bits and pieces (pointer types, address-of operators, dereference operators, passing pointers as function arguments ("by value")).

By contrast, C++ contains a native reference type and thus implements reference semantics directly in the language.

Other languages have different approaches to this, for example, in Python many things are references by default (and there's a difference, say, between a = b and a = b[:] when b is a list). Whether and how a language provides reference vs value semantics is a profound part of the language's design.

Note that they talk about passing pointers "by value". I would say this is similar to the idea of passing references by value, in C# or JavaScript.
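The a = b versus a = b[:] difference mentioned in that quote can be seen in a few lines of Python. This is only a small illustration; the variable c is added here for the copy so that both cases can sit side by side.

```python
b = [1, 2, 3]

a = b      # a and b are two names for the same list object
c = b[:]   # c is a new list holding the same elements (a shallow copy)

b[0] = 99  # mutate the list through the name b

a_sees_change = (a[0] == 99)  # a refers to the same list, so it sees the change
c_sees_change = (c[0] == 99)  # c is a separate list, so it does not

print(a, c, a is b, c is b)
```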

And what we call references in C++, C#, JavaScript and Python aren't all exactly the same thing. Obviously in C++, a reference can point to any type of variable (including a primitive), but not so for JavaScript and Python.

e.g.

https://medium.com/@naveenkarippai/learning-how-references-work-in-javascript-a066a4e15600

I would argue that the concept of a "reference" in Monkey C is a lot like the concept of a reference in JavaScript or Python, but less like the concept of a reference in C++.

And when people say "Monkey C is pass-by-value", they mean the same thing that they mean when they say "JavaScript is pass-by-value".

dpawlyk said:
By value => a copy is created. Whatever you do with the copy doesn't change the original.

By/as reference => another name (alias) to one thing.

Again, I'll point out that some of the Stack Overflow discussions and the Microsoft page above make a very clear distinction between "passing by reference" and "passing a reference type".

Don't confuse the concept of passing by reference with the concept of reference types. The two concepts are not the same. A method parameter can be modified by ref regardless of whether it is a value type or a reference type. There is no boxing of a value type when it is passed by reference.

dpawlyk said:
By value => a copy is created. Whatever you do with the copy doesn't change the original.

Right, and that value can be a pointer (C/C#), reference (C#/JavaScript), primitive value or entire object (C). That's where the semantic confusion creeps in.

I think it's also useful to read how others describe passing references to functions in python.

stackoverflow.com/.../how-do-i-pass-a-variable-by-reference

Arguments are passed by assignment. The rationale behind this is twofold:

- the parameter passed in is actually a reference to an object (but the reference is passed by value)
- some data types are mutable, but others aren't

So:

- If you pass a mutable object into a method, the method gets a reference to that same object and you can mutate it to your heart's delight, but if you rebind the reference in the method, the outer scope will know nothing about it, and after you're done, the outer reference will still point at the original object.
- If you pass an immutable object to a method, you still can't rebind the outer reference, and you can't even mutate the object.

What stands out here to me again is the concept of passing a reference by value. (Which I've stressed over and over again -- and my understanding is that you believe this is a contradiction in terms).
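The mutable and immutable cases from that quote can be sketched as follows. The helper names append_item, try_to_mutate and bump are hypothetical and exist only for this example.

```python
def append_item(items):
    # Mutable argument: the change is visible to the caller.
    items.append(4)


def try_to_mutate(point):
    # Immutable argument: a tuple rejects item assignment.
    try:
        point[0] = 99
    except TypeError:
        return False
    return True


def bump(value):
    # Rebinding the parameter creates a new local binding only.
    value = value + 1
    return value


nums = [1, 2, 3]
append_item(nums)            # the caller's list gains an element

pt = (1, 2)
mutated = try_to_mutate(pt)  # the tuple cannot be changed in place

n = 10
bumped = bump(n)             # n itself is not rebound by the call

print(nums, pt, mutated, n, bumped)
```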

0 dpawlyk over 3 years ago in reply to flowstate

flowstate said:
What stands out here to me again is the concept of passing a reference by value.

I don't think this clarifies anything.

At the basic level, a reference is a memory address.

If you pass by reference, you are providing a memory address value, which is used to get to some other value or values.

0 dpawlyk over 3 years ago in reply to flowstate

flowstate said:
I assume you mean C++, since I don't think anyone claims that C has references.

int I; // this is a reference. I is a reference to memory with a value. You can change what's stored there (the value of I) but you can't make I refer to some other memory address.

The first answer on stackexchange was clear (it's what I said). The "more precise" thing obscures what is going on.

C is unusual because it lets you treat memory addresses as values you can do operations on.

In most languages (other than C) a reference value (memory address) isn't something you can futz with.

The Python link is saying the same thing but obscuring it with confusing jargon.

References are memory addresses (they are names given to blocks of memory).

C (unusually) lets you use memory addresses as values in some places.

0 flowstate over 3 years ago in reply to dpawlyk

dpawlyk said:
int I; // this is a reference. I is a reference to memory with a value. You can change what's stored there (the value of I) but you can't make I refer to some other memory address.

TL;DR I would prefer to call I in this example an lvalue. (See below).

dpawlyk said:
flowstate said:
What stands out here to me again is the concept of passing a reference by value.

I don't think this clarifies anything.

Maybe it doesn't clarify anything. We could look at a counter-example: "passing a reference type by reference" (in Microsoft's C# parlance).

void foo(ref int[] a, int[] b) {
    a = b; // a is a ref parameter, so this rebinds the caller's variable
}

void test() {
    int[] x = { 1, 2, 3, 4 };
    int[] y = { 5, 6, 7, 8 };

    foo(ref x, y); // now x and y refer to the same array / memory location: { 5, 6, 7, 8 }
}

Or we could pass the same "reference type by value"

void foo2(int[] a, int[] b) {
    a = b; // only the local copy of the reference is rebound
}

void test() {
    int[] x = { 1, 2, 3, 4 };
    int[] y = { 5, 6, 7, 8 };

    foo2(x, y); // x and y still point to different arrays
}

Here's someone else who used "pass reference [type] by reference" and "pass reference [type] by value" in the same way as I did above. I'm not saying they're right or wrong. Initially when I started posting in this thread, I considered "pass by value" to have the same meaning that you are ascribing to it. But I thought about it some more (especially considering that someone said "Monkey C is 'pass by value'") and looked at the terminology that (some) others are using (especially the fact that some say "JavaScript is 'pass by value'").

https://stackoverflow.com/a/45994242

What a confusing use of terms!

To clarify,

- for a method foo(int[] myArray), "passing a reference (object) by value" actually means "passing a copy of the object's address (reference)". The value of this 'copy', ie. myArray, is initially the Address (reference) of the original object, meaning it points to the original object. Hence, any change to the content pointed to by myArray will affect the content of the original object.

  However, since the 'value' of myArray itself is a copy, any change to this 'value' will not affect the original object nor its contents.

- for a method foo(ref int[] refArray), "passing a reference (object) by reference" means "passing the object's address (reference) itself (not a copy)". That means refArray is actually the original address of the object itself, not a copy. Hence, any change to the 'value' of refArray, or the content pointed to by refArray is a direct change on the original object itself.
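Python has no ref keyword, so the two cases can only be approximated there. The sketch below assumes a made-up one-slot holder class, Ref, to stand in for a ref parameter; it illustrates the difference in what the caller observes, not how C# implements ref.

```python
class Ref:
    """Hypothetical one-slot holder standing in for a C#-style ref parameter."""

    def __init__(self, value):
        self.value = value


def foo(a_ref, b):
    # Roughly like foo(ref int[] a, int[] b): the assignment goes through
    # the holder, so the caller can observe the new binding.
    a_ref.value = b


def foo2(a, b):
    # Roughly like foo2(int[] a, int[] b): only the local name a is rebound.
    a = b


x = [1, 2, 3, 4]
y = [5, 6, 7, 8]

foo2(x, y)
x_after_foo2 = x   # x still refers to the original list

holder = Ref(x)
foo(holder, y)
x = holder.value   # the caller reads the new binding back out of the holder

print(x_after_foo2, x, x is y)
```

The extra step of reading holder.value back is exactly the part that a language-level ref parameter does for you.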

I will say that if we use "pass by value" to refer to function parameter semantics which state that a function can't change the value of a parameter (whether that "value" is a pointer, reference or primitive value), then calling a language "pass by value" is probably pretty facile since there probably aren't any popular languages that aren't "pass-by-value" by that definition. (I guess you could say C# and C++ are pass-by-value by default, but allow you to choose pass-by-reference semantics when you use the ref args language feature.)

To play devil's advocate, let's say my current usage of "pass by value" and "pass by reference" is completely wrong.

What terminology would you use to differentiate the behavior of foo() and foo2() above, with respect to a? Especially given the fact that C# uses the ref keyword here?

dpawlyk said:
int I; // this is a reference. I is a reference to memory with a value. You can change what's stored there (the value of I) but you can't make I refer to some other memory address.

That may be technically and formally correct (*), but when people ask questions such as "Does C have references?", they probably mean the C++ language feature referred to as "references" (reference variables).

e.g.

int a = 42;
int& b = a; // b is now an alias for a

Of course, "references" in Java, C#, JavaScript and Python don't work quite this way. And ppl describe Python as "call by assignment" instead of call-by-value or call-by-reference.
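For contrast with the C++ alias above, here is what the same two statements do in Python. This is a minimal sketch; the only point is that b = a binds a second name to the same object and does not make b an alias for the name a.

```python
a = 42
b = a   # b is bound to the same int object as a, but is not an alias for a

b = 7   # rebinding b has no effect on a

print(a, b)
```

With the C++ declaration int& b = a, the assignment b = 7 would have changed a as well.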

It's funny because some would say that a pointer is a kind of "reference" in C (loosely speaking). Hence the term "dereferencing a pointer." (I am aware of the differences between C++ references and C/C++ pointers, of course.)

All of this is to say I don't think any of the POVs are right or wrong but clearly some of the concepts are fuzzy, ambiguous and context-dependent.

I've worked with people who coded in C for decades, and looked at me blankly when I said things like "when you dereference this pointer..." I have a feeling they would not accept your definition of a reference. (That doesn't make it incorrect.)

(*) As I haven't read the formal spec, I can't say what's "correct" or not myself. If we want to get technical, I would say that in your example -- "int I" -- I is an lvalue. Yes, I realize that a C++ reference variable is simply an alias for an lvalue, which probably lends credence to the argument that in C, an lvalue is a kind of reference. But I still claim that it's not what most people would think of when they say "reference". And yes, you can say that an lvalue is an expression that refers to a memory location (and you can get the address of that memory location using the & operator) but I still don't like to think of that as a "reference", because I think it muddies the waters even further.

I do think it's fair to say that "C references" (in your parlance), C++ reference variables / C# "ref locals", C#/Python/Java/JavaScript references all refer (ha!) to related concepts which are not identical.

Ref args in C++ and C# are yet another related concept, but not exactly the same as the others. The stackoverflow comment I pasted above hopefully explains it adequately.

0 flowstate over 3 years ago in reply to flowstate

Another data point from a MSFT engineer:

https://devblogs.microsoft.com/premier-developer/performance-traps-of-ref-locals-and-ref-returns-in-c/

The C# language from the very first version supported passing arguments by value or by reference.

Clearly (from context and by considering the other things MSFT has written), he's referring to the difference between

void foo(ref T a)

and

void foo2(T a)

regardless of what T itself is (a so-called "reference type" such as int[] or a so-called "value type" such as int). e.g. Suppose T is int[] -- I believe that dev would still refer to foo2 as "passing a by value".

0 jim_m_58 over 3 years ago

Remember, in Monkey C you're not dealing directly with pointers, numbers, etc., just objects. In the case of an array, it's a collection of other objects. So "x" above is put on the call stack and a copy is seen in f() as "arr", but it's a different object than "x" (a copy), but the elements of "arr" are exactly the same objects as in "x". So when you modify one of those objects using "arr", you see that in "x".