23 Nov, 2008, David Haley wrote in the 21st comment:
Votes: 0
So it seems to me that the whole problem here is that the array iteration was doing something it wasn't supposed to be doing, not that the toolkit was doing something it wasn't supposed to be doing.

Re: C: I think it's a little different. What I meant is that Javascript has this magic non-enumerable flag that you can't control. So there are hidden language mechanisms that directly affect you. C doesn't have magic like that: everything is exposed. It doesn't pretend to give you fancy features that "just work". This non-enumerable flag just smells incredibly fishy to me.

elanthis said:
(Being a C/C++ junkie, I also quite like it when the language just figures it out on its own based on where the variable is first declared.)

This is not an approach well suited to dynamic languages where you make heavy use of closures, or where lexical scope isn't really as meaningful.

————-

Re: JQuery: thanks for the explanation. I still can't help but think that this could be addressed if the language semantics were saner.

In Lua, for example, you add non-enumerable properties by simply adding them to the object's metatable. In Javascript terms, that would be the prototype. If you wanted to keep the current prototype around, you would introduce a new metatable in between. The "for each" loop has extremely simple semantics with no magic happening: it loops over every single key present in a table. This does not iterate over methods or other "non-contained" properties because, well, they aren't contained in the table, but in its metatable. (I suppose I need to be fair: the only magic property is the metatable, which is something that every Lua object (or in some cases type) will have, that you cannot access directly as a property anyhow.) This non-enumerable stuff bugs me and kind of smells of an afterthought-type fix.
23 Nov, 2008, elanthis wrote in the 22nd comment:
Votes: 0
DavidHaley said:
So it seems to me that the whole problem here is that the array iteration was doing something it wasn't supposed to be doing, not that the toolkit was doing something it wasn't supposed to be doing.


That's fair. I would argue that any JavaScript toolkit should just take into account that there is a _lot_ of code in the wild that misuses for...in, and that it's better to play along with existing code rather than introducing incompatibilities.
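
A minimal sketch of the clash being discussed, assuming a toolkit that extends Array.prototype by plain assignment. The method name `each` is hypothetical and not taken from any of the toolkits named in this thread.

```javascript
// Hypothetical toolkit-style extension: plain assignment creates an
// enumerable property on the prototype that every array inherits.
Array.prototype.each = function (fn) {
  for (let i = 0; i < this.length; i++) fn(this[i], i);
};

const items = ["a", "b"];

// "Misused" for...in: walks enumerable keys, including inherited ones.
const naiveKeys = [];
for (const key in items) naiveKeys.push(key);

// Defensive variant: skip anything the array does not own itself.
const ownKeys = [];
for (const key in items) {
  if (Object.prototype.hasOwnProperty.call(items, key)) ownKeys.push(key);
}

// Index-based iteration never sees the extension at all.
const values = [];
for (let i = 0; i < items.length; i++) values.push(items[i]);

delete Array.prototype.each; // undo the extension so later code is unaffected

console.log(naiveKeys); // includes "each" alongside "0" and "1"
console.log(ownKeys);   // only "0" and "1"
console.log(values);    // the two elements, nothing else
```

Either side can avoid the problem: the page code by guarding or indexing, the toolkit by not extending the built-in prototype.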

Quote
Re: C: I think it's a little different. What I meant is that Javascript has this magic non-enumerable flag that you can't control. So there are hidden language mechanisms that directly affect you. C doesn't have magic like that: everything is exposed. It doesn't pretend to give you fancy features that "just work". This non-enumerable flag just smells incredibly fishy to me.


It is fishy, yes. It's another example of something that was abused and misused by implementations, and ended up becoming a part of future standards since the standard tries to reflect what the implementations are actually doing. I wouldn't at all be surprised if Microsoft is to blame for it all. :p It is a misfeature that needs to be addressed in future language revisions, though; that I will whole-heartedly agree with.

Quote
elanthis said:
(Being a C/C++ junkie, I also quite like it when the language just figures it out on its own based on where the variable is first declared.)

This is not an approach well suited to dynamic languages where you make heavy use of closures, or where lexical scope isn't really as meaningful.


It's easier than one might think. C/C++ requires that variables be declared. JavaScript, for all intents and purposes, requires this as well (using the 'var' keyword) in order to provide sane scoping. A slight change in behavior to JavaScript would fix the issue, by simply throwing an exception if an unbound property is assigned to. This is easy to do in every implementation, too; it just isn't done by default for compatibility reasons. I would be quite delighted if JavaScript2 did this, however. It won't even be a compatibility problem in the slightest, as code inside JavaScript 1.x blocks would just retain the old behavior by using a different assignment opcode. It's like a 20 line patch tops to SpiderMonkey.
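
A sketch of the two behaviors under discussion. It assumes an engine with ECMAScript 5 strict mode, which postdates this thread, as a stand-in for the proposed change; whether that matches the exact patch imagined above is an assumption. The variable names are hypothetical.

```javascript
// Function bodies built with the Function constructor are non-strict
// unless they opt in, so the two cases can be compared side by side.
const sloppy = new Function("leakedCounter = 1;");
const strict = new Function('"use strict"; strictCounter = 1;');

sloppy(); // silently creates a property on the global object
const leakedType = typeof globalThis.leakedCounter;

let strictError = null;
try {
  strict(); // assignment to an unbound name is rejected
} catch (err) {
  strictError = err;
}

console.log(leakedType);                            // "number"
console.log(strictError instanceof ReferenceError); // true

delete globalThis.leakedCounter; // clean up the accidental global
```

Code that really wants a new global can still assign to a property of the global object explicitly, which is the escape hatch mentioned in the next paragraph.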

Closures have little to do with it, as all the variable binding is determined by the point of closure definition, not invocation. The change I suggest would have absolutely no effect on existing closures, other than requiring a closure to use something like Global.foo = bar to define a new global property.
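
A small sketch of that point, with hypothetical names: the closure keeps using the `count` it captured where it was defined, no matter what the calling scope contains.

```javascript
function makeCounter() {
  let count = 0; // bound when the closure is created
  return function increment() {
    count += 1;
    return count;
  };
}

function callElsewhere(fn) {
  const count = 100; // a different `count` at the call site; the closure ignores it
  return fn();
}

const counter = makeCounter();
const first = callElsewhere(counter);
const second = callElsewhere(counter);

console.log(first, second); // 1 2
```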

Actually, that reminds me of the last real JavaScript wart I can think of. By default, implementations use the Object object as the global environment. Meaning that all global variables become properties on all objects (because everything inherits from Object). That is pretty dumb. It's technically a browser environment issue, not an actual JavaScript issue (when you embed a JavaScript engine, you can use a unique global object if you so wish, and then bind that object to its own Global property), but it's still annoying that one has to put up with it in browser code.

Quote
Re: JQuery: thanks for the explanation. I still can't help but think that this could be addressed if the language semantics were saner.


If you're not ever going to mix code with any third party code, then extending the default types is perfectly legitimate. I do it in jsmud, and I do it in the Apache mod_js component I'm working on. MooTools and Prototype and jQuery are not made for these environments, though. They're made for plugging into client-side web pages served inside a browser which is already running in weird back-compat modes to put up with Microsoft bugs or other oddities that existing, working code relies on. A toolkit made for this environment should play well with this environment, and that means preserving both the "correct" and the real world code out there. Otherwise you end up with this thread. :)

Quote
In Lua, for example, you add non-enumerable properties by simply adding them to the object's metatable. In Javascript terms, that would be the prototype. If you wanted to keep the current prototype around, you would introduce a new metatable in between.


Prototypes and metatables aren't really meant to serve the same purposes. Metatables are used to extend behavior of a table. Prototypes are meant solely as an inheritance mechanism. Totally different concepts. You need to set the __index property of a metatable to another table in Lua to emulate JavaScript prototypes.

Lua's mechanism is certainly more powerful. There is no way to alter the behavior of a JavaScript object like you can in Lua. The magic prototype property and length properties are indeed a little confusing, as it means that if you try to use a JavaScript object as a string->value map (which is what they are), you can't use the string 'prototype' without causing problems. That is bad. I'd have much preferred global functions like getPrototype(obj)/setPrototype(obj, proto) and length(obj) over the magic properties, but we're stuck with the original design until JavaScript2 (which is unlikely to fix this, unfortunately).
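
A sketch of the map problem, using `toString` as the colliding key. The choice of key is an assumption for illustration; any name inherited from Object.prototype behaves the same way. `Object.create(null)` is a later ECMAScript 5 addition, and `hasKey` is a hypothetical helper.

```javascript
const wordCounts = {}; // plain object used as a string -> value map

// Nothing has been stored, yet inherited names already appear to be present.
const looksPresent = "toString" in wordCounts;
const inheritedType = typeof wordCounts["toString"];

// Guarding each lookup with an own-property check avoids the collision.
function hasKey(map, key) {
  return Object.prototype.hasOwnProperty.call(map, key);
}
const reallyPresent = hasKey(wordCounts, "toString");

// An object with no prototype has nothing to inherit in the first place.
const bareMap = Object.create(null);
const bareLooksPresent = "toString" in bareMap;

console.log(looksPresent, inheritedType);     // true "function"
console.log(reallyPresent, bareLooksPresent); // false false
```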

It _is_ dumb that implementations started using non-enumerable flags on all built-in methods of an object. I have a feeling they did this because old implementations didn't expose native methods as Function objects (all modern implementations do), but I'm not sure.
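
The flag can at least be observed from script with `propertyIsEnumerable`. The sketch below also uses `Object.defineProperty`, which arrived with ECMAScript 5 after this discussion and does let script code set the flag. The method name `last` is hypothetical.

```javascript
// Built-in methods are flagged non-enumerable by the engine.
const builtInVisible = Array.prototype.propertyIsEnumerable("push");

// A method added by plain assignment is enumerable.
Array.prototype.last = function () { return this[this.length - 1]; };
const assignedVisible = Array.prototype.propertyIsEnumerable("last");
delete Array.prototype.last;

// With defineProperty the same method can be hidden from for...in.
Object.defineProperty(Array.prototype, "last", {
  value: function () { return this[this.length - 1]; },
  enumerable: false,
  configurable: true, // allows the cleanup below
  writable: true,
});
const definedVisible = Array.prototype.propertyIsEnumerable("last");

const seen = [];
for (const key in [10, 20]) seen.push(key); // the hidden method is skipped
const lastValue = [10, 20].last();

delete Array.prototype.last; // restore the prototype

console.log(builtInVisible, assignedVisible, definedVisible); // false true false
console.log(seen, lastValue); // [ '0', '1' ] 20
```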

Quote
The "for each" loop has extremely simple semantics with no magic happening: it loops over every single key present in a table. This does not iterate over methods or other "non-contained" properties because, well, they aren't contained in the table, but in its metatable. (I suppose I need to be fair: the only magic property is the metatable, which is something that every Lua object (or in some cases type) will have, that you cannot access directly as a property anyhow.) This non-enumerable stuff bugs me and kind of smells of an afterthought-type fix.


Lua definitely has a stronger design. I will freely admit that I tend to like JavaScript just slightly more as I am just more comfortable with the C-like syntax (yes, I'm ashamed to admit, I'm a syntaxist, and one with poor taste at that). I also have never liked languages that try to mix objects and arrays into a single mechanism, even though Lua does it quite well. The whole indexes starting at 1 thing also just rubs me the wrong way.

I've considered forking Lua, giving it curly braces, changing the indexing to use 0 (mostly a library issue), replacing 'local' with 'var' and making it mandatory for unbound variables, and then getting rid of the # operator (# should start a comment line, darnit!) and replacing it with something else, and then releasing it as Cua. :) Every time I open up the Lua parser code I throw up a little in my mouth though and give up on the idea. The Lua parser is pretty ugly (inconsistent naming schemes, no comments, etc.) and is very directly tied to code generation with no intermediate form, making parser hacking rather unpleasant. Granted, it was done that way for a good reason, as Lua parsing is incredibly fast, even for ginormous input files that would choke almost every other parser/compiler.

The saddest thing is that JavaScript2 apparently is aiming to add real classes to JavaScript instead of just fixing up the prototype warts. :( JavaScript really is an excellent language, aside from a few problems, and it's saddening to see the idiots in charge deciding to just add more complexity and more corner cases and warts instead of just fixing up a relatively small number of design flaws. Ah well.
24 Nov, 2008, David Haley wrote in the 23rd comment:
Votes: 0
elanthis said:
That's fair. I would argue that any JavaScript toolkit should just take into account that there is a _lot_ of code in the wild that misuses for...in, and that it's better to play along with existing code rather than introducing incompatibilities.

I suppose so, but I bemoan this for pretty much the same reason that I bemoan the idea of designing standards to fit funky implementations: if something is behaving oddly, you don't officialize that oddity, you try to get people to fix it.

elanthis said:
Closures have little to do with it, as all the variable binding is determined by the point of closure definition, not invocation.

I was thinking about languages that have dynamic lexical scoping, for example. Or languages that allow you to change a function's environment. In those cases, point of invocation can actually matter a great deal.

elanthis said:
Prototypes and metatables aren't really meant to serve the same purposes. Metatables are used to extend behavior of a table. Prototypes are meant solely as an inheritance mechanism. Totally different concepts. You need to set the __index property of a metatable to another table in Lua to emulate JavaScript prototypes.

I think you sort of said this below the above quote, but I would argue that a metatable is actually a similar concept, but a superset of what prototypes do. With metatables, you can implement prototype-based inheritance; you can also implement completely different things. Metatables are some of the most powerful programming tools I've ever used, although it takes a while to wrap one's head around just how powerful they are.

Interestingly enough (but not terribly surprisingly) the idea is not new at all. Smalltalk lets you define how to handle unknown messages, which is (more or less) what metatables do.

elanthis said:
I also have never liked languages that try to mix objects and arrays into a single mechanism, even though Lua does it quite well.

It's funny, I guess I've worked in Lua for so long that this doesn't even strike me as an issue. It took me a while to figure out what you meant. :wink: But you're right: Lua is so typeless, and imposes so few restrictions, that it's easy to get a little lost.

One of my main gripes with Lua – that isn't actually a gripe with Lua per se but with any typeless language – is that the vast majority of programming actually should be typed, and you gain from dynamic types in only a rather small number of cases. On the other hand, having static typing can save you over and over again as the compiler can check that you're not doing something stupid like calling "bark" on an object that you know is a bird.

elanthis said:
The whole indexes starting at 1 thing also just rubs me the wrong way.

Eh, I guess I just got used to this too. It actually makes some things more intuitive, if you're willing to unlearn the C-land dogma of zero being the first element. (And I say that without any tongue-in-cheek: it is only relatively recently in my programming career that I used anything without 0-based indexing.)

elanthis said:
I've considered forking Lua (…)

In my experience, having different syntax is very valuable for making me realize that I am in truly different programming languages, so I don't try to do things from another language. The different syntax is a problem only for the first few days, really, then I more or less stop seeing the syntax consciously. When I go from C++ to Perl to Tcl (don't ask) to Lua to Python, I very quickly know what's going on just by absorbing the different syntax. Java is the exception because it looks so much like C++, except that I do all of my Java coding in Eclipse so that gives me my cue.

Unsurprisingly, your suggestion has come up several times on the Lua mailing list, and the arguments I've heard against it were actually pretty convincing, to me at least (the syntactical cue being one of them).

You could of course accomplish your goal without even touching the parser, by using token filters.

elanthis said:
JavaScript really is an excellent language, aside from a few problems, and it's saddening to see the idiots in charge deciding to just add more complexity and more corner cases and warts instead of just fixing up a relatively small number of design flaws.

JavaScript and Lua are remarkably similar languages in their semantics and how you express various things. I agree that it's rather unfortunate that JavaScript is heading in this direction instead of cleaning things up and simplifying.

It's not that Lua couldn't use an officially sanctioned OOP model that modules and libraries could agree upon – it's just that it's not clear that such a thing belongs in the core of the language, which should be kept as straightforward as possible.
24 Nov, 2008, elanthis wrote in the 24th comment:
Votes: 0
Quote
I suppose so, but I bemoan this for pretty much the same reason that I bemoan the idea of designing standards to fit funky implementations: if something is behaving oddly, you don't officialize that oddity, you try to get people to fix it.


I do agree. Idealism and reality don't meet too often, unfortunately. It's often not even a question of fixing implementations – around 25% of Internet users still use IE6. Microsoft could make IE8 the bestest browser ever and it wouldn't mean squat. All those cool features Firefox offers – the newer JavaScript revisions with various fixes and improvements (like adding many of those methods toolkits like Mootools offer into the built-in objects) – are totally and utterly useless to developers, because at the end of the day, they can't ignore all the sods who refuse to upgrade their ancient software to something released in the last decade. So even if the standards body says something is wrong and that implementations should change, it would just end up meaning that developers now have two incompatible systems to target.

And then of course you do get the corporate inertia. Microsoft is finally moving, but they're not moving very quickly. You can beg and plead and make all the cases you want, but at the end of the day, you're a slave to what the vendors want to do. Making a JavaScript standard that doesn't reflect what the implementations do is useless. At best you end up with a standard that nobody uses, and at worst you cause a fracture between the implementations that follow your standard and the ones that decide to just keep doing things their own way.

This actually happened in the JavaScript camp rather recently. The standards group working on ECMAScript4 was trying to push some more radical changes, but Microsoft and a couple of the big service and toolkit vendors (Yahoo being one of the bigger ones, if I recall) came back and said, "we're not going to implement this." The standards body really didn't have much choice other than to scrap a lot of their ideas and go back to the drawing board to come up with something that everyone would actually follow, and in the process drop some of the more exciting (and also some of the more frightening) ideas they originally had. The alternative would've been to have MS-JavaScript and EveryoneElse-JavaScript, which wouldn't have done anybody any good.

Quote
I was thinking about languages that have dynamic lexical scoping, for example. Or languages that allow you to change a function's environment. In those cases, point of invocation can actually matter a great deal.


If you took my meaning to be that the language should figure it out at compile time (my mistake), then yes. I meant to say that it should be figured out at run-time, but with hints from the syntax to let the run-time know what to do. My biggest complaint with JavaScript, Lua, and a few other languages is that a variable declaration can accidentally pollute the global scope, and the syntax makes it easier to accidentally do that than it does to do what you intended. Not good.

What I want is that if a variable is assigned to without using var/local, and that variable does not already exist somewhere in the environment chain, then either raise an error or at least create the variable in the local scope instead of the global environment. When referencing an undefined variable as a rvalue, do not raise an error, but instead return undefined/nil. Both JavaScript and Lua already provide ways to explicitly create new global variables from inside a function or closure, so there's no decrease of functionality; just a decrease in unintentional global scope pollution. :)

Quote
It's funny, I guess I've worked in Lua for so long that this doesn't even strike me as an issue. It took me a while to figure out what you meant. :wink: But you're right: Lua is so typeless, and imposes so few restrictions, that it's easy to get a little lost.


It's not about getting lost. I do a lot of work in PHP (yes, I want to hang myself) which does things the same way as Lua in this regard. I just don't like it. :)

I don't think I started to have issues with it until years after I first learned either language. I have to sometimes transmit data between systems written in languages that have separate list and map types, or using protocols that have separate types, e.g., JSON notation or XML-RPC. I either end up having to write explicit serialization code for every complex data type (that knows which tables to serialize as a list and which to serialize as a map/structure) or I have to rely on icky tricks to automagically figure things out and still fail for some cases.
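
A sketch of the kind of guessing described above, written in JavaScript with a plain object standing in for a Lua or PHP table. The helper names `looksLikeList` and `encodeTable` are hypothetical, the 1-based key rule is an assumption borrowed from Lua's convention, and nested tables are deliberately not handled.

```javascript
// Treat a table as a list only if its keys are exactly "1".."n".
function looksLikeList(table) {
  const keys = Object.keys(table); // integer-like keys come back in ascending order
  return keys.every((key, i) => key === String(i + 1));
}

function encodeTable(table) {
  if (looksLikeList(table)) {
    return JSON.stringify(Object.keys(table).map((key) => table[key]));
  }
  return JSON.stringify(table);
}

const asList = encodeTable({ 1: "sword", 2: "shield" });
const asMap = encodeTable({ name: "Bob", hp: 10 });

// Failure cases: an empty table gives no hint either way, and a map that
// happens to use ids 1..n as keys is indistinguishable from a list.
const emptyGuess = encodeTable({});
const idMapGuess = encodeTable({ 1: "alice", 2: "bob" });

console.log(asList);     // ["sword","shield"]
console.log(asMap);      // {"name":"Bob","hp":10}
console.log(emptyGuess); // []
console.log(idMapGuess); // ["alice","bob"]
```

The last two results are where the receiving system may have expected a map, and the heuristic has no way of knowing.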

I can't think of any situation that I've really received any benefit out of having tables representing both an ordered list and an unordered map at the same time. I'm not convinced it's any simpler in the language internals, either, given some of the funky stuff Lua does to optimize tables. It might actually simplify things a bit if the internals treated a table as either a list or as a map, but never both. Not sure. Granted, that would break existing code using table constructor syntax to dynamically create sparse lists, so it's not a change I'd actually suggest Lua to make.

Quote
One of my main gripes with Lua – that isn't actually a gripe with Lua per se but with any typeless language – is that the vast majority of programming actually should be typed, and you gain from dynamic types in only a rather small number of cases. On the other hand, having static typing can save you over and over again as the compiler can check that you're not doing something stupid like calling "bark" on an object that you know is a bird.


Don't say that around a "pro" Python programmer. ;) They'll bite your head off, call you stupid, claim that you're obviously a Java programmer who needs hand-holding by the compiler, and that everyone knows dynamic is better and static is dead. I really hate talking to Python diehards sometimes.

I'm fairly sure that the only place I need static typing is in module interfaces. When I declare a variable, I really don't care what the type is. I don't mind if it's statically typed so long as the compiler just figures it out for me. I want my method signatures to clearly state the types so external code is forced to play by the rules my module operates under, though.

The Python apologists will claim that I just need to write 20 lines of unit tests for every line of actually useful code, though. Why have the compiler enforce a rule with a single additional word in the source when you can write a unit test for your code (and every external call point that invokes your code) to make sure compatible types are used? Feh. N00bs.

While I appreciate the theoretical uses of "duck typing" and while I have been bitten a couple times by static systems that require specific interfaces and proxy objects to get everything working, I agree with your assessment of how often I really need those dynamic features and how often the static typing saves me pain. There's nothing quite so fun as code that passes in some random half-compatible type to a module which stores the value away until it's needed, and then the application crashes at some future point because that value doesn't have some specific method you expected to be available. Even with extensive unit tests, tracking down that kind of problem is just a pain in the rear. It's the dynamic language equivalent of a dangling pointer.
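
A sketch of that failure mode, with an entirely hypothetical `createEventQueue` module. The unchecked path stores a half-compatible value and only fails at dispatch time; the checked path does by hand, at the module boundary, what a typed method signature would enforce.

```javascript
// Hypothetical module: stores a listener now, calls it later.
function createEventQueue() {
  const listeners = [];
  return {
    // Unchecked: accepts anything and fails only when dispatch runs.
    addUnchecked(listener) { listeners.push(listener); },
    // Checked: rejects incompatible values at the module boundary.
    addChecked(listener) {
      if (!listener || typeof listener.notify !== "function") {
        throw new TypeError("listener must have a notify() method");
      }
      listeners.push(listener);
    },
    dispatch(event) { listeners.forEach((l) => l.notify(event)); },
  };
}

const queue = createEventQueue();
queue.addUnchecked({ notfy() {} }); // misspelled method goes unnoticed here

let lateError = null;
try {
  queue.dispatch("tick"); // ...and surfaces far from the faulty call site
} catch (err) {
  lateError = err;
}

let earlyError = null;
try {
  createEventQueue().addChecked({ notfy() {} }); // rejected immediately
} catch (err) {
  earlyError = err;
}

console.log(lateError instanceof TypeError);  // true
console.log(earlyError.message);              // listener must have a notify() method
```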

DavidHaley said:
Eh, I guess I just got used to this too. It actually makes some things more intuitive, if you're willing to unlearn the C-land dogma of zero being the first element.


If I only had to work in Lua, I wouldn't mind it at all. Having to use one mental model for Lua and another for everything else is the annoying part. Whatever conceptual benefits Lua's approach might have, in the grand scheme of things it's just one more inconsistency that makes mixing code and data between languages a pain in the butt.

I really wouldn't call Lua's choice "intuitive." Every interface and concept a human works with is something he has to learn (aside from the nipple, as the joke goes). Even if the reasons for C's zero-based indexing are irrelevant in some particular language, that doesn't negate the fact that just about every programming language counts from zero. Thus the most intuitive thing for a programming language to do is to follow common convention and start from zero. Every little "gotcha" reduces how intuitive something is to learn. Starting from zero is a "gotcha" for someone who's never used any other programming language, but starting from one is a "gotcha" for anyone who has.

Given what Lua's original design called for, starting from one made a certain amount of sense. Given how it's most often used today, though, I definitely feel that having it start with one is an annoying flaw.

Quote
In my experience, having different syntax is very valuable for making me realize that I am in truly different programming languages, so I don't try to do things from another language.


Meh, not generally a problem, personally. The syntax is just window dressing over the concepts the language embodies. I just happen to like a particular color of curtain. ;) (And extending metaphors like a torture victim on a rack, apparently. … case in point.)

I've experimented with a _lot_ of languages over the years, and experimented with a lot of custom language designs. There are practically an infinite number of syntax approaches one can take for almost every part of the language. For whatever reason, I just happen to prefer the look of curly braces and semicolons, independent of any other part of the syntax (e.g., how identifiers are formed, or how function calls look, or assignment operators, or loop statements, etc.). It's pointless and just a matter of preference, I guess. I get equally picky over the colors a highlighting code editor uses – strings should be red, not blue, dammit. :)

Quote
You could of course accomplish your goal without even touching the parser, by using token filters.


I could make the syntax alterations, yes. It's the other changes I'd want to make that would require changes to the internals.

The do-it-yourself-er in me kinda just wants to write it all from scratch anyway. I'd really like to play around with some of those new interpreter/compiler techniques the JavaScript folks have been raving about lately, and it's so much more potent of a learning experience when you implement it yourself rather than just tweaking an existing system.

NOTE: I do want to warn anyone reading this that while "doing it yourself" is great as a learning mechanism, it is quite atrocious as a production-quality development approach. Use the existing tools if you're trying to build real software. Otherwise you end up with something that takes five times longer to design and implement, takes five times as much manpower and resources to maintain, and is probably five times buggier and slower. I learned that lesson the hard way in years past, so please save yourself the pain and take my advice. :)
24 Nov, 2008, Tyche wrote in the 25th comment:
Votes: 0
Bah! Everyone knows static typing is so yesterday.
The future belongs to monkey patching and duck typing!
24 Nov, 2008, Scandum wrote in the 26th comment:
Votes: 0
elanthis said:
Given what Lua's original design called for, starting from one made a certain amount of sense. Given how it's most often used today, though, I definitely feel that having it start with one is an annoying flaw.

Doesn't Lua allow using -1 to refer to the last index? If so it makes sense to start at 1 because you can return 0 as the error value given all other values are valid. And I assume that someone used to Lua would be ticked off at the countless times you have to use -1 in C to get a loop working.
24 Nov, 2008, David Haley wrote in the 27th comment:
Votes: 0
elanthis said:
Idealism and reality don't meet too often, unfortunately.

Unfortunately, yes. I think that JavaScript is a good example of a language that was let loose without really being ready, and as a result it's turned into this mess where implementations are competing with standards which are themselves competing with decade-old implementations. Pretty sad state of affairs IMO.

elanthis said:
What I want is that if a variable is assigned to without using var/local, and that variable does not already exist somewhere in the environment chain, then either raise an error or at least create the variable in the local scope instead of the global environment.

It is impossible to know this at compile time (due to the environment chain being dynamic) and if you did it at runtime you would lose all of the efficiency of local variables – they do not require a symbol-table (environment chain) lookup.

I acknowledge that language design should not necessarily be driven by implementation concerns, but local-vs.-global-by-default is something of a toss-up IMO – it's mainly preference. In that case, I'd go with the more efficient solution.

elanthis said:
When referencing an undefined variable as a rvalue, do not raise an error, but instead return undefined/nil.

Now this I actually think is pretty dangerous. This can make typos disappear silently, which in my experience is something that you really don't want.

Note incidentally that Lua detects undefined variables using constructs fully available to the programmer (metatables). Pretty darn cool IMO. (see 'strict.lua')

elanthis said:
I can't think of any situation that I've really received any benefit out of having tables representing both an ordered list and an unordered map at the same time.

Well, serialization is one of them. You can treat every table as key=value pairs. You don't need to worry about whether those keys happen to be a sequence of numbers. It's not so much a question of simplicity in the internals as it is from the programmer's perspective.

elanthis said:
It might actually simplify things a bit if the internals treated a table as either a list or as a map, but never both

This is already the case, actually. The implementation applies optimizations for tables that happen to only contain keys that are sequential (without too many holes). As soon as you break that assumption (by introducing a large enough hole, or a non-integral key) it reverts to full table mode.

elanthis said:
Don't say that around a "pro" Python programmer. ;) They'll bite your head off, call you stupid, claim that you're obviously a Java programmer who needs hand-holding by the compiler, and that everyone knows dynamic is better and static is dead. I really hate talking to Python diehards sometimes.

Ungh. If somebody tried pulling that with me I'd likely smack them up the head :biggrin:

elanthis said:
When I declare a variable, I really don't care what the type is. I don't mind if it's statically typed so long as the compiler just figures it out for me.

I think this is the crux of it, really. The problem with static typing is having to specify every last type. And that is why I am growing extraordinarily fond of Scala, because it is statically typed but completely type-inferred. So you don't have to worry about specifying every last type. It can do great things like specify types with wildcards: "some type that implements method foo", giving you the advantages of "(static) duck typing" when you need it – sort of similar to Java generics. (Perhaps not coincidentally, the author of Scala is also the guy who designed Java generics.)

elanthis said:
Even with extensive unit tests

Part of the argument made in favor of dynamically typed languages is that you don't have to waste time dealing with types and as a result are much more productive. Unfortunately, that doesn't mesh too well with having to write unit tests that are basically doing what the compiler should already be doing for you! :wink:

elanthis said:
Given how it's most often used today, though, I definitely feel that having it start with one is an annoying flaw.

Well, as you said, this isn't enforced by anything other than a few library functions. You could very easily start all arrays from 0. Of course, you'd have to adapt if you ever talked to modules that made the assumption. But really, even then, in my experience you rarely directly index into arrays in Lua. You either iterate over the whole list, or you access an index that was found by another function that iterated.

elanthis said:
I get equally picky over the colors a highlighting code editor uses – strings should be red, not blue, dammit. :)

Uh, actually, they should be green, thankyouverymuch. :lol:

elanthis said:
NOTE: I do want to warn anyone reading this that while "doing it yourself" is great as a learning mechanism, it is quite atrocious as a production-quality development approach.

Not to be too corny, but I agree 1,000%. If you're not doing it to learn, you had better have an extraordinarily good reason to design and implement a whole new language that's actually useful (beyond, say, mudprog). It is a hard thing to do, and most people fail miserably after having made just enough progress to be hopeful that things will turn out ok.

Scandum said:
Doesn't Lua allow using -1 to refer to the last index?

Not really. It does this for strings, as parameters to string library functions. If you used -1 with a table, you would just be looking up whatever is associated with the key "-1".

Conceptually, Lua arrays are just associative maps whose keys happen to all be numbers more-or-less in sequence.
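To make the difference concrete, here is a small sketch (the variable names are only for illustration). The string library counts negative positions from the end, while a table treats -1 as one more key:

```lua
-- String library functions accept negative positions, counted from the end.
local s = "hello"
local last = s:sub(-1)          -- "o"
local tail = string.sub(s, -3)  -- "llo"

-- A table treats -1 as an ordinary key, unrelated to the last element.
local t = { "a", "b", "c" }
local missing = t[-1]           -- nil: nothing is stored under the key -1
t[-1] = "stored at key -1"      -- legal, but it is not part of the sequence
local final = t[#t]             -- "c": the usual way to reach the last element
```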

For error values, "nil" is more-or-less the standard way of saying "does not exist".
24 Nov, 2008, elanthis wrote in the 28th comment:
Votes: 0
Quote
It is impossible to know this at compile time (due to the environment chain being dynamic) and if you did it at runtime you would lose all of the efficiency of local variables – they do not require a symbol-table (environment chain) lookup.


Sorry, I thought I made it pretty clear I meant to do this at _runtime_.

Quote
Now this I actually think is pretty dangerous. This can make typos disappear silently, which in my experience is something that you really don't want.


It also will cause massive breakage in those very same highly dynamic setups you were mentioning. It's also worth noting that JavaScript and Lua both work exactly like that already, unless you add in hooks/handlers for catching it yourself. I was just clarifying that I didn't want that behavior to change.
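For anyone wondering what such a hook might look like in Lua, here is a rough sketch. It assumes nothing else has already put a metatable on _G, and the misspelled name is made up for the example:

```lua
-- Make reads of undefined globals raise an error instead of quietly yielding nil.
-- Assumption: no other module has installed its own metatable on _G.
setmetatable(_G, {
  __index = function(_, name)
    error("undefined global variable '" .. tostring(name) .. "'", 2)
  end,
})

-- A misspelled name now fails loudly; pcall captures the error for inspection.
local ok, err = pcall(function()
  return playr_name -- hypothetical typo for 'player_name'
end)

-- Remove the hook again so code outside this sketch is unaffected.
setmetatable(_G, nil)
```

Without the hook, the same read would just return nil and the typo would go unnoticed.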

DavidHaley said:
Well, serialization is one of them. You can treat every table as key=value pairs. You don't need to worry about whether those keys happen to be a sequence of numbers. It's not so much a question of simplicity in the internals as it is from the programmer's perspective.


Except it breaks when you're transmitting data to another system that actually cares if it's a list or not. Interoperability matters a _lot_ to me.

DavidHaley said:
This is already the case, actually. The implementation applies optimizations for tables that happen to only contain keys that are sequential (without too many holes). As soon as you break that assumption (by introducing a large enough hole, or a non-integral key) it reverts to full table mode.


Hmm. I was pretty sure that a table could contain both data in regular table mode and data in the fast indexed array mode. My mistake.

DavidHaley said:
Ungh. If somebody tried pulling that with me I'd likely smack them up the head :biggrin:


As soon as someone invents a way to do that over the Internet, I will be the happiest man in the world. ;)

Quote
I think this is the crux of it, really. The problem with static typing is having to specify every last type. And that is why I am growing extraordinarily fond of Scala, because it is statically typed but completely type-inferred. So you don't have to worry about specifying every last type. It can do great things like specify types with wildcards: "some type that implements method foo", giving you the advantages of "(static) duck typing" when you need it – sort of similar to Java generics. (Perhaps not coincidentally, the author of Scala is also the guy who designed Java generics.)


Yup. There's plenty of languages that do that. Incidentally, you can actually do it with GCC (and possibly G++) too, using the typeof() extension and a little macro love to make it easier to type (albeit, uglier). I'll be quite happy when C++0x is out, too, since it makes the 'auto' type use this kind of behavior by default. And C++ is the language that _really_ needs type inference the most. std::vector<std::pair<std::string, my_type<other_type> > >::iterator…. ugh.

DavidHaley said:
Well, as you said, this isn't enforced by anything other than a few library functions. You could very easily start all arrays from 0. Of course, you'd have to adapt if you ever talked to modules that made the assumption.


The # operator would not work correctly, iirc. It just looks for the largest index.

Quote
But really, even then, in my experience you rarely directly index into arrays in Lua. You either iterate over the whole list, or you access an index that was found by another function that iterated.


Yeah, it's not a constant issue. It's an issue just often enough to bug me, though. :)

Quote
Uh, actually, they should be green, thankyouverymuch. :lol:


Oh, it's on! You and me, 3 o'clock, behind the old church. Bring your gun, a priest, and a long wooden box.
25 Nov, 2008, David Haley wrote in the 29th comment:
Votes: 0
elanthis said:
Sorry, I thought I made it pretty clear I meant to do this at _runtime_.

My bad then. In that case, you still use the (rather significant) efficiency gains that local variables get you.

elanthis said:
Except it breaks when you're transmitting data to another system that actually cares if it's a list or not. Interoperability matters a _lot_ to me.

I think this is a problem whenever two languages disagree about representation. I'm not sure it's a Lua-specific problem. But, it's fairly easy to determine if a Lua table is in fact a list or a map. (The brute force way is to just check "pairs" and make sure all keys are integers.)
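A sketch of that brute-force check, with is_list being a name I made up for it. It walks every key once, so it costs time proportional to the size of the table:

```lua
-- Returns true when the keys of t are exactly 1..n with no gaps and no other keys.
local function is_list(t)
  local count, max = 0, 0
  for k in pairs(t) do
    -- Reject anything that is not a positive whole number.
    if type(k) ~= "number" or k < 1 or k % 1 ~= 0 then
      return false
    end
    count = count + 1
    if k > max then max = k end
  end
  -- count distinct positive integer keys with largest key == count means 1..count.
  return count == max
end
```

Note that an empty table counts as a list here; a serializer that cares about the difference would need its own convention for that case.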

elanthis said:
Hmm. I was pretty sure that a table could contain both data in regular table mode and data in the fast indexed array mode. My mistake.

Well, I'm almost positive that it reverts to full table mode. I know that # will keep "working" but in strange ways. I know that they changed this behavior in between 5.0 and 5.1 – I could be remembering from one version instead of the other. Now I'm curious and will go look it up at some point…

elanthis said:
Yup. There's plenty of languages that do that. Incidentally, you can actually do it with GCC (and possibly G++) too, using the typeof() extension and a little macro love to make it easier to type (albeit, uglier). I'll be quite happy when C++0x is out, too, since it makes the 'auto' type use this kind of behavior by default. And C++ is the language that _really_ needs type inference the most. std::vector<std::pair<std::string, my_type<other_type> > >::iterator…. ugh.

To be honest, that's probably the single most crippling aspect of the STL, and why the Java container framework – which is really quite similar at the end of the day – is oodles easier to work with.

elanthis said:
(about 0-indexing) The # operator would not work correctly, iirc. It just looks for the largest index.

Indeed, but you can override the # operator using the "len" metamethod.

(And it doesn't quite look for the largest index; if you have holes it could return something below the largest one.)
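To illustrate the parenthetical: as far as I understand the reference manual, # on a table may return any "border", meaning an index n where t[n] is set and t[n+1] is not. The sketch below (is_border is a helper name made up for the example) checks exactly that and deliberately does not assume which border comes back. One caveat on the metamethod route: to my knowledge plain tables only honor __len from Lua 5.2 on, so under 5.1 you would need a method or wrapper instead.

```lua
-- A "border" in the manual's sense: t[n] is set and t[n+1] is not
-- (n may be 0 when t[1] is nil).
local function is_border(t, n)
  if n == 0 then return t[1] == nil end
  return t[n] ~= nil and t[n + 1] == nil
end

local dense = { 10, 20, 30 }

local holey = { 10, 20, 30 }
holey[2] = nil       -- punch a hole
holey[1000] = "far"  -- and add a distant element

-- With no holes the answer is unambiguous.
local dense_ok = (#dense == 3)

-- With holes, 1, 3 and 1000 are all borders; which one # returns is up to
-- the implementation, so only the border property itself is checked.
local holey_ok = is_border(holey, #holey)
```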

elanthis said:
Oh, it's on! You and me, 3 o'clock, behind the old church. Bring your gun, a priest, and a long wooden box.

Sure thing, just tell me what your measurements are and what color you'd like me to syntax-color your epitaph with. I'll forgive you if you give me the "wrong" answer considering the circumstances. :tongue:
25 Nov, 2008, elanthis wrote in the 30th comment:
Votes: 0
DavidHaley said:
My bad then. In that case, you still use the (rather significant) efficiency gains that local variables get you.


Sorry, was I implying I wouldn't?

Quote
I think this is a problem whenever two languages disagree about representation. I'm not sure it's a Lua-specific problem. But, it's fairly easy to determine if a Lua table is in fact a list or a map. (The brute force way is to just check "pairs" and make sure all keys are integers.)


Well, yes. My point being that I don't see the advantage in representing things the way Lua (or PHP, or others) do, but I do see the disadvantage in doing things differently from most other languages, and from most "standard" formats.

And the brute-force way really sucks for large lists. :)

Quote
To be honest, that's probably the single most crippling aspect of the STL, and why the Java container framework – which is really quite similar at the end of the day – is oodles easier to work with.


I haven't actually used Java since before generics were added. I know that using Java containers before was actually a pain in the butt. Still ended up using a lot of extra typing due to all the casting.

Quote
Indeed, but you can override the # operator using the "len" metamethod.


Is there a (not grossly inefficient) way to automatically set metatables for every single table, including those created using literal table constructors or created from C modules, in such a way that won't conflict with the metatables other modules are setting?

Quote
(And it doesn't quite look for the largest index; if you have holes it could return something below the largest one.)


Huh. I thought it was supposed to work even for sparse lists. Mark that up as a point in favor of JavaScript – most implementations support sparse Array objects, and they have an always-working length property. ;) And yes, while taking the length of a table in Lua is necessary less often, it is still necessary sometimes. :)

Quote
Sure thing, just tell me what your measurements are and what color you'd like me to syntax-color your epitaph with. I'll forgive you if you give me the "wrong" answer considering the circumstances. :tongue:


:)
25 Nov, 2008, David Haley wrote in the 31st comment:
Votes: 0
elanthis said:
DavidHaley said:
My bad then. In that case, you still use the (rather significant) efficiency gains that local variables get you.


Sorry, was I implying I wouldn't?

By "use" above I meant "lose"… not sure if you read it as "use" (which is what I wrote…) or "lose" (which is what I meant!). :redface:

elanthis said:
Well, yes. My point being that I don't see the advantage in representing things the way Lua (or PHP, or others) do, but I do see the disadvantage in doing things differently from most other languages, and from most "standard" formats.

The advantage is that the programmer doesn't have to worry about anything if they're just working in Lua. I agree that it can be a potentially very annoying issue if you have to shuttle data back and forth.

elanthis said:
And the brute-force way really sucks for large lists. :)

What I would do, if I was working in a framework where this mattered, would be to just wrap my tables in container class abstractions that enforce this kind of stuff. Then I can ask the container to serialize itself, as opposed to having to look at it and figure out how to do so.

elanthis said:
I haven't actually used Java since before generics were added. I know that using Java containers before was actually a pain in the butt. Still ended up using a lot of extra typing due to all the casting.

Generics fixed this to a very large extent. I hated using it before generics – all this dumb casting. Generics have their flaws, of course, but they make life a lot easier.

elanthis said:
Is there a (not grossly inefficient) way to automatically set metatables for every single table, including those created using literal table constructors or created from C modules, in such a way that won't conflict with the metatables other modules are setting?

Probably not. There is a single metatable for strings, but tables can each have their own metatable. You could make your own constructor function that would do the job, though, and could even be syntactically quite nice. E.g.

t = list({1,2,3})


Presumably you don't want to do this for all tables, just ones that act like lists.
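A minimal sketch of what such a constructor could look like. The List table and its append and serialize methods are hypothetical names for illustration, not anything the standard library provides:

```lua
-- Shared metatable; methods live here rather than in each list.
local List = {}
List.__index = List

-- Hypothetical constructor matching the call shape used above.
local function list(t)
  return setmetatable(t, List)
end

-- Append a value at the end of the sequence and return the list for chaining.
function List:append(v)
  self[#self + 1] = v
  return self
end

-- Because the container knows it is a list, it can serialize itself as one.
function List:serialize()
  local parts = {}
  for i, v in ipairs(self) do
    parts[i] = tostring(v)
  end
  return "[" .. table.concat(parts, ",") .. "]"
end

local t = list({ 1, 2, 3 })
t:append(4)
local text = t:serialize()
```

Since the methods sit in the metatable, iterating over t with pairs still only sees the four elements.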

elanthis said:
Huh. I thought it was supposed to work even for sparse lists. Mark that up as a point in favor of JavaScript – most implementations support sparse Array objects, and they have an always-working length property. ;) And yes, while taking the length of a table in Lua is necessary less often, it is still necessary sometimes. :)

It does work on sparse lists to some extent. It uses heuristics to figure out if your hole is "too big". For instance, adding 1,2,5 is probably ok; adding 1,2,1000 will probably break the # operator (and revert to non-array table mode, actually).

To be honest, I can think of probably only one occasion where I've actually needed the number of elements in a table. In the vast majority of cases, I am either iterating over all (potentially breaking) or testing for emptiness.

Again, though, if you really cared about length, you'd just make your own "Array" class. Lua provides mechanisms, not standards; this is a double-edged sword, of course.
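Something along these lines, as a sketch only. Array, set, get and length are names invented for the example; the idea is simply to track the largest index explicitly rather than rely on #:

```lua
-- Hypothetical Array class that tracks its own length.
local Array = {}
Array.__index = Array

function Array.new()
  -- items holds the values; n is the largest index ever assigned.
  return setmetatable({ items = {}, n = 0 }, Array)
end

function Array:set(i, v)
  self.items[i] = v
  if i > self.n then self.n = i end
end

function Array:get(i)
  return self.items[i]
end

-- Always the largest index assigned so far, holes or not.
function Array:length()
  return self.n
end

local a = Array.new()
a:set(1, "x")
a:set(1000, "y")
local len = a:length()
```

This version never shrinks the length when an element is set back to nil; whether that is the right behavior is a design choice.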
25 Nov, 2008, elanthis wrote in the 32nd comment:
Votes: 0
Quote
By "use" above I meant "lose"… not sure if you read it as "use" (which is what I wrote…) or "lose" (which is what I meant!). :redface:


I think we're still talking about something else entirely. :) I definitely do not want to lose any of the benefits of local variables. The whole crux of the issue is that Lua and JavaScript both make it easier to accidentally make a variable global than it is to make the variable local (like you want 99% of the time).
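A tiny Lua illustration of what I mean (the function names are made up, and it assumes the code runs as an ordinary chunk with the default global environment):

```lua
local function count_items(items)
  total = 0 -- missing 'local': this assigns a global named 'total'
  for _ in pairs(items) do
    total = total + 1
  end
  return total
end

local function count_items_safely(items)
  local n = 0 -- declared local, invisible outside the function
  for _ in pairs(items) do
    n = n + 1
  end
  return n
end

local a = count_items({ "x", "y" })
local leaked = rawget(_G, "total")  -- 2: the counter escaped into the global table

local b = count_items_safely({ "x", "y" })
local contained = rawget(_G, "n")   -- nil: nothing leaked
```

Both functions return the same result; only the first one leaves something behind in the global table.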

Quote
What I would do, if I was working in a framework where this mattered, would be to just wrap my tables in container class abstractions that enforce this kind of stuff. Then I can ask the container to serialize itself, as opposed to having to look at it and figure out how to do so.


That's one approach. That requires me to change every piece of code that is going to create a list that I may at some point want to serialize to make it use the wrapper table, or do all the wrapping at serialization time. Which at that point, I might as well just write a serialization routine for the data in question.

Quote
It does work on sparse lists to some extent. It uses heuristics to figure out if your hole is "too big". For instance, adding 1,2,5 is probably ok; adding 1,2,1000 will probably break the # operator (and revert to non-array table mode, actually).


Right. But it _always_ works in JavaScript. The length property is guaranteed to be correct. Even if it only matters once in a blue moon, I like me some 100% guarantees. ;)
25 Nov, 2008, David Haley wrote in the 33rd comment:
Votes: 0
elanthis said:
I think we're still talking about something else entirely. :) I definitely do not want to lose any of the benefits of local variables. The whole crux of the issue is that Lua and JavaScript both make it easier to accidentally make a variable global than it is to make the variable local (like you want 99% of the time).

Oh, I think I got sidetracked there. I guess it just hasn't been an issue to me; I'm so conditioned to having to declare my variables that I do it more or less automatically. Again, though, this is something that has come up before: I remember seeing an argument on the list in favor of global-by-default that I thought was pretty good, but unfortunately I can't remember it now! :redface:

elanthis said:
That's one approach. That requires me to change every piece of code that is going to create a list that I may at some point want to serialize to make it use the wrapper table, or do all the wrapping at serialization time. Which at that point, I might as well just write a serialization routine for the data in question.

I do this most of the time anyhow because I like having nice container abstractions. Sometimes I think that I'm writing code in Lua that perhaps should not be in Lua. I haven't yet found a sweet spot between the C++ host and Lua…

elanthis said:
Right. But it _always_ works in JavaScript. The length property is guaranteed to be correct. Even if it only matters once in a blue moon, I like me some 100% guarantees. ;)

Well, this has come up on the Lua list; you might want to look at the discussions from there. There are good arguments on both sides IMO.