
Keywords and Names

We need clear names for:
  • keywords in the language
  • standard modules
  • standard types
  • standard methods
  • user defined modules, classes, types, variables, etc.
Zimbu is a new language, so it is likely that we will need more keywords over time, also after version 1.0 has been released.  We don't want to break any existing program, thus it must be impossible for a new keyword to already be in use by a program.

For example, one of my Python scripts broke when Python 2.6 was introduced.  The script was written in such a way that it was backwards compatible all the way back to Python 1.5, and now it failed.  It turned out that "as" was used as a variable name, and since Python 2.6 it's a keyword.  This required every user of this script to update to a new version.

So how do we pick keywords that won't interfere with names that the user is free to use?

1. Use a specific leader for reserved words, such as '%':
                %class if               # "if" is the name of a class
                   %string string
                   %proc strip()
                     ...
                   }
                }
                %int count = 9
                if mine = %new()        # "mine" is an object of class "if"
                %if !mine.string.empty(%list<%string> words)
                    mine.strip()
                %else
                    done()
                }
        - extra characters to type
        - different from most other languages (although it's obvious)
        + easy to spot language items

2. Use a specific leader for user symbols, such as '$', like PHP:
                class $if
                   string $string
                   proc $strip()
                     ...
                   }
                }
                int count = 9
                $if $mine = new()
                if !$mine.$string.$empty(list<string> words)
                    $mine.$strip()
                else
                    $done()
                }
        - many extra characters to type
        - looks strange when used for class names

3. Use a capital, all user symbols must start with lower case:
                Class if
                   String string
                   Proc strip()
                     ...
                   }
                }
                Int count = 9
                if mine = New()
                If !mine.string.empty(List<String> words)
                    mine.strip()
                Else
                    done()
                }
        + straightforward
        - user can't give classes or modules an upper case name, e.g. Myclass
        - still a bit hard to pinpoint keywords, easy to make mistakes

4. Use all capitals, all user symbols must contain a lower case character:
                CLASS If
                   STRING string
                   PROC strip()
                     ...
                   }
                }
                INT count = 9
                If mine = NEW()
                IF !mine.string.empty(LIST<STRING> words)
                    mine.strip()
                ELSE
                    done()
                }
        + straightforward
        + easier to read back
        - looks like SHOUTING, esp. for classes and module names.
        - it's more difficult to separate type names from keywords
        - a bit harder to type

5. Use all capitals for reserved words, not for type names.
    This is a compromise: there can still be some name conflicts, but only a few.
                CLASS If
                   String string
                   PROC strip()
                     ...
                   }
                }
                Int count = 9
                If mine = NEW()
                IF !mine.string.empty(List<String> words)
                    mine.strip()
                ELSE
                    done()
                }
        + straightforward
        + easier to read back
        o a lot less SHOUTING
        - name conflicts, need to make a list of reserved type names
        - still a bit harder to type

6. Use all capitals for reserved words, type names start with Z.
    This means user types cannot start with a Z, which is reasonable.
                CLASS If
                   Zstring string
                   PROC strip()
                     ...
                   }
                }
                Zint count = 9
                If mine = NEW()
                IF !mine.string.empty(Zlist<Zstring> words)
                    mine.strip()
                ELSE
                    done()
                }
        + straightforward
        + easier to read back
        o a lot less SHOUTING
        - type names are one character longer
        - still a bit harder to type

7. Use all capitals for reserved words, type names start with underscore.
    This means user types cannot start with an underscore.
                CLASS If
                   _String string
                   PROC strip()
                     ...
                   }
                }
                _Int count = 9
                If mine = NEW()
                IF !mine.string.empty(_List<_String> words)
                    mine.strip()
                ELSE
                    done()
                }
        + straightforward
        + easier to read back
        o a lot less SHOUTING
        - type names are one character longer
        - still a bit harder to type

8. Use all capitals for reserved words, type names start with a lower case letter.
    User defined types must start with an upper case letter.
                CLASS If
                   string string
                   PROC strip(list<string> words)
                     ...
                   }
                }
                int count = 9
                If mine = NEW()
                IF !mine.string.empty()
                    mine.strip()
                ELSE
                    done()
                }
        + straightforward
        + still easy to read back (esp. if predefined type names are highlighted)
        o a lot less SHOUTING
        o "string string" can be confusing, but the user can avoid it
        - still marginally harder to type


Choice: 8. make all keywords upper case, predefined type names start with a lower case letter.
                  Disallow names that are made only of upper case characters.
                  Disallow user type names that start with a lower case letter.
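
The two "Disallow" rules above can be sketched as a small check.  This is only
an illustration in Python, not the Zimbu compiler's code: the function name
check_user_name() and its messages are made up here, and it assumes that digits
and underscores do not count when deciding whether a name is "only upper case".

```python
from typing import Optional


def check_user_name(name: str, is_type: bool) -> Optional[str]:
    """Return an error message, or None when the user name is allowed.

    Hypothetical helper: assumes a name is "made only of upper case
    characters" when it has at least one letter and no lower case letter.
    """
    # All upper case is the shape reserved for keywords such as IF and CLASS.
    if name.isupper():
        return "names made only of upper case characters are reserved"
    # Lower case first letter is the shape reserved for predefined types.
    if is_type and name[:1].islower():
        return "user type names must not start with a lower case letter"
    return None


if __name__ == "__main__":
    for name, is_type in [("IF", False), ("If", True), ("string", True), ("mine", False)]:
        print(name, is_type, check_user_name(name, is_type))
```

With this split a keyword added later (all upper case) or a predefined type
added later (lower case first letter) cannot collide with a user type name.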

Status: implemented

Enforcing a naming style

Code is much easier to read when a consistent naming style is used:
- function, method and variable names start with a lower case character or '_'
        openFile()
        getName()
        file 
        name
        _msg
- class and enum names start with an upper case character.  The second character cannot be an underscore.
        Node
        FilePointer
- interface names start with "I_"
        I_writer
        I_runnable
- exceptions start with "E_"
        E_negativeArg
        E_pastEnd

Enforcing this has hardly any drawback, so let's do that.
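
As a rough sketch of what enforcing the style could look like, the Python
function below tests a name against the four rules listed above.  The function
name and the "kind" strings are assumptions made for this example; how the
Zimbu compiler actually performs the check is not described here.

```python
def name_matches_kind(name: str, kind: str) -> bool:
    """Check a name against the naming style for the given kind of item.

    Hypothetical helper; "kind" is one of the strings tested below.
    """
    if kind in ("function", "method", "variable"):
        # Starts with a lower case character or '_'.
        return name[:1].islower() or name.startswith("_")
    if kind in ("class", "enum"):
        # Starts with an upper case character, second one is not '_'.
        return name[:1].isupper() and name[1:2] != "_"
    if kind == "interface":
        return name.startswith("I_")
    if kind == "exception":
        return name.startswith("E_")
    raise ValueError("unknown kind: " + kind)


if __name__ == "__main__":
    print(name_matches_kind("openFile", "function"))
    print(name_matches_kind("I_writer", "class"))
```

The rule about the second character is what keeps "I_writer" and "E_pastEnd"
from also being accepted as class names.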

Status: implemented

Naming class member variables

When writing a big class it becomes unclear what is a local variable, what
is a method argument and what is a member variable.  Also, a lot of methods
would have an argument that has the same name as a member variable. This
usually leads to adding some character to the argument:

        CLASS Item
          int val
          PROC setVal(int _val)
            val = _val
          }
        }

This doesn't look nice.  Some languages allow the argument name to be equal
to the member variable name (Java) and then require using "this.val"
to access the member.  That leads to mistakes, so let's not do that.

1. Use $foo to access object members.  Here "$" is short for "this.", thus elsewhere
   you can still use "myobject.foo".  And it's still not allowed to use "foo"
   for an argument or local variable, to avoid making mistakes.

    CLASS Item
      int val
      PROC setVal(int _val)
        $val = _val  # means: "THIS.val = _val"
      }
    }
    ...
    Item i
    i.val = 8

   + In a long class it's immediately clear what name is a member var.
   - Still need to give arguments another name.

2. Using _val doesn't look nice in docs.  Better have the member vars look
    different.  Require all member variables to start with "_".

    CLASS Item
      int _val
      PROC setVal(int val)
        _val = val
      }
    }
    ...
    Item i
    i._val = 8

   + In a long class it's immediately clear what name is a member var.
   - o._val everywhere, also outside the class

3. Mix of the two above:

    CLASS Item
      int $val
      PROC setVal(int val)
        $val = val  # means: "THIS.val = val"
      }
    }
    ...
    Item i
    i.val = 8

   + Easy to spot member variables in the class
   + No need for "_" in set argument or member var.
   - declaring $val looks strange

4. Like 3, but don't use $ in the declaration.

    CLASS Item
      int val
      PROC setVal(int val)
        $val = val  # means: "THIS.val = val"
      }
    }
    ...
    Item i
    i.val = 8

5. Like 3, but use $ for all object members, including methods.

    CLASS Item
      int $val
      PROC $setVal(int val)
        $val = val  # means: "THIS.val = val"
      }
    }
    ...
    Item i
    i.val = 8

   + Easy to spot member variables in the class
   + Can spot SHARED items, see the difference between class members and object members
   + No need for "_" in set argument or member var.

   + more like what people are used to

Choice: 5

Status: implemented

Methods with predefined semantics

There are standard methods which are useful to have on any class.
When defined they must have the expected arguments and return type.
For example: toString() returns a string that represents the value of the object.
And equal() checks whether the values of two objects are the same.

Since some methods may be added later, the names must be different from methods
that the user is free to add.  Otherwise a later version of the language would break
programs that have a method with a name that now requires the predefined semantics.
So far names like EQUAL and COMPARE were used.  However, this is not consistent
with toString().

An overview of alternatives:
  • Always use all-caps: TOSTRING(), EQUAL(), COMPARE()
    Results in a lot of all-caps, making it more difficult to find keywords such as FOR and IF
  • Prepend a special character, such as underscore (a bit like Python): _toString(), _equal(), _compare()
    Looks OK, but it is a new kind of naming.
  • Let the methods start with an upper case letter: ToString(), Equal(), Compare()
    This works, since methods must start with a lower case letter.
    It is similar to how builtin types start with a lower case letter, only opposite.
  • Use normal method names: toString(), equal(), compare(), but only use them when the signature matches.
    This is like how the Go language implements interfaces: the methods just happen to match.
    The problem is that when there is a method like equal but it has different semantics, the behavior
    can be totally unexpected.
The third one looks like the best solution.  The meaning of the names will make sure there is no
confusion with class names (which was the reason methods are required to start with a lower case letter).
Users will quickly get used to the slightly different naming rule, and it's easy to give good error messages.
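
To illustrate why the third alternative is safe for later versions, here is a
small Python sketch.  The set of names and the function method_role() are
hypothetical, and treating an unknown upper case method name as an error is an
assumption, not something decided above.

```python
# Hypothetical set of predefined method names under the third alternative.
PREDEFINED_METHODS = {"ToString", "Equal", "Compare"}


def method_role(name: str, predefined=PREDEFINED_METHODS) -> str:
    """Classify a method name by its first letter.

    "predefined": upper case first letter and known to the language
    "reserved":   upper case first letter, not (yet) known; assumed an error
    "user":       lower case first letter or '_', free for the user
    """
    if name[:1].isupper():
        return "predefined" if name in predefined else "reserved"
    return "user"


if __name__ == "__main__":
    print(method_role("ToString"))
    print(method_role("toString"))
    # A later version adds "Hash" (made-up name): user code cannot be using it.
    print(method_role("hash", PREDEFINED_METHODS | {"Hash"}))
```

Because user methods can never start with an upper case letter, adding a
predefined method in a later version cannot change the meaning of an existing
program.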

Status: not implemented yet

Ignore case


To make use easier, ignore the case of all characters after the first one, thus "HuhCap", "Huhcap"
and "HuHcAp" can all be used.  Also: "callthisfunc", "callThisFunc" and
"cALLTHISFUNC" are equivalent.

Status: not implemented yet
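
One possible way to implement this, shown as a Python sketch: compare names by
a key that keeps the first character and lowers the rest.  The function
name_key() is made up for this example and the feature itself is not
implemented, so this is only one way it could work.

```python
def name_key(name: str) -> str:
    """Key under which two names are considered the same.

    The first character keeps its case, since it carries meaning (type or
    not); the case of the remaining characters is ignored.
    """
    return name[:1] + name[1:].lower()


if __name__ == "__main__":
    print(name_key("HuhCap") == name_key("HuHcAp"))
    print(name_key("callThisFunc") == name_key("cALLTHISFUNC"))
    print(name_key("huhcap") == name_key("Huhcap"))
```

The first character stays significant, so "huhcap" and "Huhcap" remain
different names.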
