String concatenation: feature or bug?

Found this the usual way (meaning the hard way):

Background: 2024-10-13,06:25:21 beginLP _nextPos 0

		l_pos = '_' + _nextPos;
		_util.logTime("requestLogPos " + _nextPos + ", l_pos " + l_pos);
		
Background: 2024-10-13,06:25:21 requestLogPos 0, l_pos _

		l_pos = "_" + _nextPos;
		_util.logTime("requestLogPos " + _nextPos + ", l_pos " + l_pos);

Background: 2024-10-13,06:25:21 requestLogPos 0, l_pos _0

Characters can work, as in this:

	public function httpURL(php, rest) {
		return _host + php + '/' + G_uID + '_' + System.getTimer() + '_' + _httpRetrys + '/' + rest;
	}

		_util.logTime("requestLogPos l_pos " + l_pos + ", l_url " + l_url);
	
Background: 2024-10-13,06:25:21 requestLogPos l_pos logPos_0, l_url https://my.example.com/g1/lp/ffa0f28f1a9e91d2133a05fc4ff987c817fb2ab4_361254406_0/logPos_0

Multiple characters are OK, but a single character is bad, and I have to use a string as a workaround. Normally it wouldn't matter, except that strings are objects, which add up after a while.

  • Gotta say this post is not easy to read. I can make an educated guess that _nextPos is a Number, but should I have to? (It could very well be a String, although admittedly that's unlikely.) In the 2nd code snippet, we can guess that _host, php, G_uID, and rest are Strings, but _httpRetrys is a Number (maybe? Or is it a String?)

    Furthermore, the "test case" (with the broken/unexpected behavior) you provided is extremely limited (_nextPos is 0). This will be important, as you will see later.

    Anyway, I don't think it's a bug. I think the problem here is that you are expecting the result of adding a Char and a Number to be the string comprising the given Char concatenated with the given Number (after both are converted to strings).

    However, what's actually happening is that the result of adding a Char and a Number is the character whose Unicode value is equal to the Char's Unicode value with the Number added to it.

    For example:

    System.println('A' + 0); // Outputs "A"
    System.println('A' + 1); // Outputs "B"
    System.println('A' + 2); // Outputs "C"
    System.println(('A' + 2).toNumber()); // Outputs "67" (the unicode/ascii value for 'C')

    IOW, a character in Monkey C is just a numerical value, with the caveat that when you convert it to a String, the string contains the Unicode character with that value (instead of the number itself). So when you add a character and a number, you just get another character (with the corresponding numerical value after addition.) But when you add a character and a string, you get a string (as expected.)

    (If you had taken a second to come up with some simple reproduction code which wasn't specific to your app and that circumstance in your app (_nextPos == 0), you would've quickly found this out for yourself. To be fair, it's a shame that stuff like this isn't documented. In particular, there's no public Monkey C language specification for devs to examine.)

    Personally I don't see a whole lot of benefit to using characters this way. In some cases, you will have to convert either the character itself or the thing you are adding to it into a string.

    This will probably cancel out any meagre memory savings that were made by using characters instead of strings, plus it will make your code harder to read and maintain. (For max memory savings / brevity, you could add "" to every character when you want to convert it to a string, instead of using toString(). But is something like 'A' + "" really more efficient than "A"?)
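    If the goal is simply the string "_0", here is a minimal sketch of three ways to get it, assuming _nextPos really is a Number as guessed above. The names posLabel, demoPosLabel, and nextPos are made up for illustration and are not from the original post.

    ```monkeyc
    import Toybox.Lang;
    import Toybox.System;

    // Hypothetical helper: builds "_<n>" for a Number n.
    function posLabel(n as Number) as String {
        // String + Number concatenates, so the result is a String.
        return "_" + n;
    }

    function demoPosLabel() as Void {
        var nextPos = 0; // stands in for _nextPos, assumed to be a Number

        System.println(posLabel(nextPos));               // "_0": String literal on the left
        System.println('_'.toString() + nextPos);        // "_0": Char converted to a String first
        System.println(Lang.format("_$1$", [nextPos]));  // "_0": format-based alternative
    }
    ```

    The first form is the workaround already found in the original post. The second keeps the Char literal but pays for an explicit conversion, which is the trade-off questioned above.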

  • Start to compile with stricter type checking, it's useful for beginner programmers. Char != String

  • > Start to compile with stricter type checking, it's useful for beginner programmers. Char != String

    I think he knows Char isn't the same as String. He's trying to save memory by using characters instead of strings where possible, but this causes problems when adding a Char and a Number while expecting the same result as adding a String and a Number.

    Type checking wouldn't help here - System.println() accepts an object by design. (Clearly it calls toString() on the object before printing it.)

    Besides, in the code he posted, the argument to System.println() is already a string in each case. (Because in each case, strings are being added to other types, resulting in a string in the end.)

    To be clear, the type checker can't read the mind of the dev and realize that when they type something like '_' + 0, they (incorrectly) expect it to work just like "_" + 0, but with slightly lower memory consumption.

  • To be absolutely clear (and concise):

    "_" + 0 = "_0" because when you add a String x and some other value y, the result is x concatenated with y.toString()

    '_' + 0 = '_' because when you add a Char x and a Number y, you get the character whose unicode value is x's unicode value + y. This is because a Char is really just a Number under the covers (except when you convert it to a String, you get a string which contains the character itself, instead of its numerical value.)

    e.g.

    "A" + 1 = "A1"

    'A' + 1 = 'B'

    ('A' + 1).toNumber() = 66

  • > Type checking wouldn't help here - System.println() accepts an object by design

    But the + operator does care. The problem is not with the print.

  • > But the + operator does care. The problem is not with the print.

    I think I tried to explain (more than once, with examples) that it's perfectly valid to add a Char and a Number. The problem is not with the code, it's with the expectation of what the code should do.

  • Yeah, probably, but my recommendation is still valid for every programmer, especially for beginners: turn on the strict type checker! It helps!

    It's hard to even try to read the code (as you also pointed out):

    l_pos = '_' + _nextPos;

    What's the type of each of the 3 parts here? What does the developer think it is/should be?

  • Also, to explain why the following code "works as expected" (everything is concatenated):

    _host + php + '/' + G_uID + '_' + System.getTimer() + '_' + _httpRetrys + '/' + rest;

    Assuming _host is a string (this is where a reproducible code example / context would really help):

    _host + php => Concatenation of _host and php (String) - call it A

    A + '/' => Concatenation of A and '/' (String) - call it B

    B + G_uID => Concatenation of B and G_uID (String) ...

    etc.

    Due to the fact that the leftmost operand is a string, each addition involves either two Strings or a String and some other data type (which can be converted to string), and concatenation is always performed as expected.
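    To make the left-to-right rule concrete, here is a small sketch that follows the addition rules described in this thread. The function name demoOrder, the literal "lp", and the value of retries are all hypothetical; only the operators matter.

    ```monkeyc
    import Toybox.Lang;
    import Toybox.System;

    function demoOrder() as Void {
        var retries = 1; // hypothetical Number, like _httpRetrys might be

        // Leftmost operand is a String, so every + concatenates.
        System.println("lp" + '/' + retries); // "lp/1"

        // Leftmost operand is a Char, so '/' + 1 is Char arithmetic first:
        // '/' has code 47, and code 48 is the character '0'.
        // Only then is the String appended.
        System.println('/' + retries + "lp"); // "0lp"
    }
    ```

    The same operands give different results purely because of which one comes first, which is why the httpURL() expression happens to work while '_' + _nextPos does not.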

  • > Yeah, probably, but my recommendation is still valid for every programmer, especially for beginners: turn on the strict type checker! It helps!

    Ok, but you originally implied that turning on strict type checking would've caught this error, which is absolutely not the case.

    Again, the code in question (adding a Char and a Number) is valid, but the developer has misunderstood what it's supposed to do.

    The following code generates neither a warning nor an error at any type-check level (including the max: 3 / strict):

    System.println('A' + 1);

    And how could it? It's valid to add a Char and a Number, and the compiler has no way of knowing that the dev expects different behavior than what will actually happen.

    Perhaps the language should've been designed differently with respect to Chars, but it's too late now, I think.

  • > It's hard to even try to read the code (as you also pointed out):
    >
    > l_pos = '_' + _nextPos;
    >
    > What's the type of each of the 3 parts here? What does the developer think it is/should be?

    Well, based on what they posted (the log message and the "unexpected behavior"), _nextPos can only be a Number (0). If it was the string "0" (which is the other possibility based on the first log message), then OP would've seen concatenation as they expected.

    I agree that it's hard to read the code and a bit annoying to guess at a bunch of stuff.

    Again, if some effort had been taken to create a minimal working example (which isn't tied to a specific app / app state and/or stuff that hasn't been posted), OP probably would've discovered the issue for themselves.

    It just so happens that _nextPos's value of 0 is the worst possible value for this situation, because '_' + 0 = '_', so it just looks like something is broken, if you don't realize what's happening. If _nextPos had been 1 or 2 (for example), it probably would've been immediately obvious. If someone was creating a synthetic working example, they probably wouldn't have chosen 0 as a number to add to a character when testing the behavior of addition.
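    For what it's worth, a synthetic example along those lines could look like the sketch below. The function name demoCharPlusNumber is made up, and the expected output in the comments assumes the Char + Number behavior described earlier in this thread.

    ```monkeyc
    import Toybox.Lang;
    import Toybox.System;

    function demoCharPlusNumber() as Void {
        for (var i = 0; i < 3; i++) {
            // Char + Number yields the Char whose code is '_' (95) plus i.
            var c = '_' + i;
            // The message starts with a String, so everything after it is concatenated.
            System.println("i=" + i + " -> " + c + " (code " + c.toNumber() + ")");
        }
        // Expected output:
        // i=0 -> _ (code 95)
        // i=1 -> ` (code 96)
        // i=2 -> a (code 97)
    }
    ```

    With 0 the result looks like a dropped digit. With 1 or 2 it is obvious that the character itself changed.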