Self-Documenting Code

(lackofimagination.org)

72 points | by tie-in 8 months ago ago

130 comments

johnfn 8 months ago

My cut:

    const passwordRules = [/[a-z]{1,}/, /[A-Z]{1,}/, /[0-9]{1,}/, /\W{1,}/];

    async function createUser(user) {
        const isUserValid = validateUserInput(user);
        const isPasswordValid = user.password.length >= 8 && passwordRules.every((rule) => rule.test(user.password));

        if (!isUserValid) {
            throw new Error(ErrorCodes.USER_VALIDATION_FAILED);
        }


        if (!isPasswordValid) {
            throw new Error(ErrorCodes.INVALID_PASSWORD);
        }

        const userExists = await userService.getUserByEmail(user.email);

        if (userExists) {
            throw new Error(ErrorCodes.USER_EXISTS);
        }

        user.password = await hashPassword(user.password);
        return userService.create(user);
    }

1. Don't use a bunch of tiny functions. This makes it harder for future eng to read the code because they have to keep jumping around the file(s) in order to understand control flow. It's much better to introduce a variable with a clear name.

2. Don't use the `a || throw()` structure. That is not idiomatic JS.

2a. Don't introduce `throwError()`. Again, not idiomatic JS.

3. Use an enum-like object for error codes for clarity.

4. If we must use passwordRules, at least extract it into a global constant. (I don't really like it though; it's a bit too clever. What if you want to enforce a password length minimum? Yes, you could hack a regex for that, but it would be hard to read. Much better would be a list of arrow functions, for instance `(password) => password.length > 8`.

5. Use TypeScript!

[-]

OptionOfT 8 months ago

My issue with this is that you're using exceptions for control flow. A user not being valid is expected (duplicate username). A password not matching a regex is also expected.

Then, in general (not seen here as there are no types), I like to use a lot of types in my code. The incoming user would be of type UnvalidatedUser, whereas the return type of this function would be StoredUser or something like that to distinguish the incoming user type with the outgoing. I like to attach semantics to a type, not to conditions: https://existentialtype.wordpress.com/2011/03/15/boolean-bli...

[-]

aleksiy123 8 months ago

I would add one suggestion/comment. Use a known set of standard error codes and not a unique error code/error type for each new situation.

Error codes are there to hint to clients/callers what action they should potentially take (retry?, backoff)

Don't make callers handle 100s of different potential error codes/types.

If the whole internet can work with 70 your app can work with less.

All of google uses less than 20

https://github.com/googleapis/googleapis/blob/master/google/...

Put more specific information in the error message or a secondary status code.

johnfn 8 months ago

That's a great point, I didn't even think of that. I would use error types as well, yes.

jansommer 8 months ago

It's a good point, especially because different return types is well supported in TS.

I made the same in plpgsql recently and opted for returning an implicit union of UserSession | Error, by returning UserSession in the signature and raising errors in the function. The alternative was to return json where you'd have to look at the body of the function to figure out what it returns (when successful), as opposed to the signature.

I'm not sure if I'm striking the right balance. Yes, the signature is "self-documenting" - until you hit an error!

rerdavies 8 months ago

My issue with that is that absolutely NOTHING will ever convince me that returning error codes is a better idea than throwing exceptions. And that you seem to be using 'expected' in some weird cargo-culty sense of the word. An invalid user name is an error, not an expected case.

[-]

yellowapple 8 months ago

> An invalid user name is an error, not an expected case.

If you ain't expecting users to input bogus data, then you're putting way too much trust in said users.

Put simply: is it a bug in your own code if a user tries to use an invalid username? If yes, then throw an exception. If no, then return an error code. Exceptions represent programmer error; error codes represent user error.

[-]

swatcoder 8 months ago

> Exceptions represent programmer error; error codes represent user error.

No, they represent whatever suits the system design and the house style.

They're just tools to pass data and manipulate control flow.

If you become dogmatic and start insisting on what they must absolutely represent, you're only going to find yourself compromising design coherence and futily railing at colleagues when different representations are found to be appropriate.

With exceptions, it often makes sense to use them when failures inevitably just propogate up through a stack of nested calls or may appear anywhere in a sequence of calls. In these cases, return codes can grossly interfere with legibility and can lead to confusion when interim handlers squelch errors that really should have propogates up directly.

These are the kind of choices that are completely reasonable to make within a specific module (like an auth and accounts module), and you can always return to some other default policy at the interface of that module if your project/house-style needs such.

Engineering shouldn't feel so dogmatic as you suggest. It's a strong smell when it does.

nip 8 months ago

I’ll try:

Returning a “Result” with union discrimination on the “success” is far superior to throwing in Typescript in my experience

   { success: false, reason: “user_already_exists” } | { success: true, data: { user_id: string }

- By design you are forced to “deal” with the sad path: in order to get to the data, you must handle the error state

- Throwing is not type-safe: you cannot know at build time whether you are handling all thrown errors

jagged-chisel 8 months ago

“ Don't use a bunch of tiny functions. This makes it harder for future eng to read the code …”

This is where the naming things bit comes in. You name the function correctly, then when the body is read to understand that it works as named, you can remove that cognitive complexity from your brain and continue on. Once you’ve built trust in the codebase that things do what they claim, you can start getting a top-down view of what the code does.

That is the power of proper abstraction.

[-]

johnfn 8 months ago

I don't know. I really don't see any clarity improvements between, `user.password.length >= 8 && passwordRules.every((rule) => rule.test(user.password))` and `validatePassword(password)`. What if you want to add that the password must contain one special character? You don't actually know, by reading the name "validatePassword", if that work has already been done or not. You need to go read the function definition and check. And so even for such a small function, there is no name that you can choose to truly do what you claim and "remove the cognitive complexity from your brain".

Once a function gets to a certain threshold of size and reuse, I tend to agree with you. But I think that threshold is quite large - like at least 40-50 lines, or reused at least 3 times.

[-]

jagged-chisel 8 months ago

`validatePassword` is definitely clearer. When I'm getting familiar with a new codebase, I don't need to know how each thing happens, I need to know what happens. In fact, in your example, you've already abstracted testing against the rules. (AAMOF, the password length requirement should be another rule...)

Also with that example, that is about the limit I would tolerate in a conditional that decides the next step in this app. I just need to know "if the password is valid, go to $SCREEN; if not, turn on these messages." I don't care about a string of expressions, I care about what it's doing and I don't want to parse the code every time to figure out "oh, this is just password validation."

This is different from verifying that features are implemented - now I need to know the minutia.

[-]

strken 8 months ago

It is definitely not clearer to me. I have no idea what happens in the `validatePassword` function. Does it make a synchronous network call that will stop the world for half a second? Does it throw an exception, or return an error object? I will also have to search the rest of the code to see who else calls it, and potentially refactor those callers as well.

Any smaller function broken out of a larger function is (slightly) confusing. I find that a genuinely large function is much more confusing than two functions half the size, but a petite little function less than 20 lines long is not confusing to begin with.

[-]

hinkley 8 months ago

The business rules for passwords and usernames are separate. It's okay for them to be separate methods.

You also know that the 'valid password' function is going to list the rules for a valid password. If you get a task to change the password creation rules, do you honestly expect people other than you to remember that code is in the createUser function and not the valid password function??

I don't think you're being honest with yourself about this scenario. Or if you are, this is going to cause you a lot of trouble over time.

[-]

strken 8 months ago

> If you get a task to change the password creation rules, do you honestly expect people other than you to remember that code is in the createUser function and not the valid password function??

I don't expect anyone to remember where the code is, regardless of which function it's in. I don't even expect them have been aware of every function to begin with.

How do you expect people will find the `isPasswordValid` function? Because I know from experience that the way I would find it would likely be by looking at the `createUser` function to see what it does. I might do a case-insensitive grep for the word password first, but even then I'd just have to go look at `createUser` anyway, otherwise I wouldn't know whether `isValidPassword` was dead code or referring to a different password.

[-]

lurking_swe 8 months ago

how do you unit test this function that’s handling many responsibilities?

[-]

strken 8 months ago

Look at it. It's under 20 lines long and handles one thing, which is creating a user. You unit test it by passing a user with an invalid password. You might possibly decide to put the validation of user input into one function and avoid mocking userService, but I don't see a good justification for splitting password and other user fields into separate functions.

I accept that if there was more logic then you might want to split it out, but seriously, it's tiny. The single responsibility principle doesn't justify making your production code much more difficult to read just so the unit tests around it are marginally easier to write.

LorenPechtel 8 months ago

I used to feel that way, but over time I've come to realize that for the vast majority of code pulling stuff out so you have only one level of nesting and only one pass/fail decision makes it easier to understand years later. I'm sure the compiler inlines a lot of stuff I pulled out.

The exceptions to this are situations where you need to do a bunch of sequential steps and cases where you are looping on stuff in higher dimensions.

I have gotten to the point that I consider a comment about the code to be a sign of trouble. (Comments about what the code is interacting with are a different matter.)

dietr1ch 8 months ago

Abstraction is way better, I don't really want to know how the password is validated unless I know I'm facing issues with validation (which proper logging tells you about before you even dive into the code).

I don't understand why some people prefer being swarmed with details. It's not that they want details, but that they just hate navigating files (layer 8 + tooling problem) or that they "need" to know the details because not knowing them haunts them at night somehow. Also, not having that as a free function makes me think it's not tested at all (although there might be some integration test that hopefully catch all errors at once, but I'm sure they don't either)

[-]

hinkley 8 months ago

It's not abstraction it's separation of concerns. And conflating the two is part of why GP is getting wrapped around the axle here.

hinkley 8 months ago

He even knows what the missing function should be called:

> isPasswordValid

TrianguloY 8 months ago

My only nitpick is that the const isPasswordValid = ... should be just before its use (between the first two ifs). Other than that, I prefer this approach (although I would inline the booleans in the ifs to avoid the one-use variables. But that's ok).

> Don't use a bunch of tiny functions

Exactly this. I only do that when the function is used in more than 10 places and it provides some extra clarity (like something as clamp(minVal,val,maxVal){return max(minVal,min(val,maxVal))} if your language doesn't already have it, of course).

I also apply that to variables though, everything that is only used once is inlined unless it really helps (when you create a variable, you need to remember it in case it is used afterwards, which for me is a hard task)

[-]

xigoi 8 months ago

> My only nitpick is that the const isPasswordValid = ... should be just before its use (between the first two ifs).

Wouldn’t that cause the regexes to be recompiled every time you call the function?

[-]

TrianguloY 8 months ago

I don't think so, if it does it will now too. In fact with my suggestion the regex checks will not run if the user is not valid (as it is now it will always run)

    const isUserValid = ... 
    if(!isUserValid) ...

    const isPasswordValid = ...
    if(!isPasswordValid) ...

Etc

rjbwork 8 months ago

This is one of my pet peeves. I had an engineer recently wrap Dapper up in a bunch of functions. Like, the whole point of dapper to me is that it gets out of your way and lets you write very simple, declarative SQL database interactions and mapping. When you start wrapping it up in a bunch of function calls, it becomes opaque.

DRY has been taken as far too much gospel.

[-]

skydhash 8 months ago

I always prefer having loadBooks than “select * from books” everywhere. I prefer my code to be just the language version of how I would write an explanation if you ask me one specific question. Not for DRY, but for quick scanning and only diving into details when needed

[-]

rjbwork 8 months ago

Obviously there's a method called LoadBooks. The implementation of LoadBooks is what I'm talking about.

dclowd9901 8 months ago

Yep, I’d hire you, and not OOP. There was some really questionable code structuring in there and I always wince at abstraction for abstraction’s sake. It always becomes a hindrance when something goes wrong, which is exactly when you don’t want to have a hard time deciphering code.

eru 8 months ago

> 1. Don't use a bunch of tiny functions. This makes it harder for future eng to read the code because they have to keep jumping around the file(s) in order to understand control flow. It's much better to introduce a variable with a clear name.

If you have nested functions, that's not a problem.

Btw, why do you use regular expressions for some rules, but not for others? Regular expressions are perfectly capable of expressing the length requirement.

ilrwbwrkhv 8 months ago

Something about Javascript and Typescript is really ugly to my eyes. I think it is the large keywords at the beginning of every line. Makes it hard to parse and read. I find C++ style much better to read.

alilleybrinker 8 months ago

There is no such thing as universally self-documenting code, because self-documentation relies on an assumption of an audience — what that audience knows, what patterns are comfortable for them — that does not exist in general.

Self-documenting code can work in a single team, particularly a small team with strong norms and shared knowledge. Over time as that team drifts, the shared knowledge will weaken, and the "self-documenting" code will no longer be self-documenting to the new team members.

simonw 8 months ago

I don't find this easier to read:

    !(await userService.getUserByEmail(user.email)) || throwError(err.userExists);

I guess if I worked in a codebase that used that pattern consistently I'd get used to it pretty quickly, but if I dropped into a new codebase that I didn't work on often I'd take a little bit longer to figure out what was going on.

[-]

mega_dean 8 months ago

After that step, they say "The resulting code is shorter and has no nested logic." The resulting code has the same logic as before, it's just not visually represented as being nested. I've seen the same argument ("nesting is bad so indentation is a code smell") used to say that it's better to use early returns and omit the `else` block, eg:

    if (some_condition) {
      // do stuff here
      return;
    }
    // do other stuff here

is "better" than:

    if (some_condition) {
      // do stuff here
    } else {
      // do other stuff here
    }

If you have very-deeply nested code then it usually becomes easier to work with after splitting it up into smaller pieces. But IMO rewriting code like this to save a single level of indentation is bikeshedding.

[-]

jonathanlydall 8 months ago

I would say (as all good technical people know) it depends.

I have come to appreciate the style of early returns rather than else statements as I have found over the years it generally makes the code easier for me to follow when I’m looking at it possibly years later.

It really depends on the particular condition, but sometimes it just reads better to me to not use the else, and this is because as a style I tend to try have “fail conditions” cause an early return with a success being at the end of the method. But again there are regularly exceptions where trying to do this “just because” would contort the code, so returning an early success result happens often enough.

I have however found that sometimes ReSharper’s “avoid nesting” suggestion (particularly in examples like yours) results in less clear code, but it’s almost always at least not worse and maybe slightly better for the sake of consistency.

EDIT: Having thought about this more, here is why I find early returns generally easier to read than else statements.

With an early return the code is generally more linear to read as when I get to the end of the if block I can instantly see there is nothing else of relevance in the method, I save myself having to needlessly scan for the end of the else block, or even worse, past more code blocks only to find that the rest of the method’s code is irrelevant.

Again, not a hard rule, but a consistent style in a code base also generally makes it easier to read.

[-]

fenomas 8 months ago

It definitely depends, but personally I find early returns to be a bit of an antipattern IF they're based on business logic. If a function has lots of ifs, you can glance at the nesting to see which ones affect the line you want to edit, and ignore the others. But if the function has lots of returns, you have to check every one before a given line in order to know what constraints are true at that point.

OTOH early returns are great for anything the type checker knows about:

    function foo(msg: 'OK' | 'ERR') {
        // ...
        if (msg === 'ERR') return someValue
        // ...
    }

Doing that is hugely cleaner than branching, and there's no added complexity to the developer since tooling can easily tell you what values `msg` can have at any given point in the function.

[-]

kazinator 8 months ago

> you can glance at the nesting to see which ones affect the line you want to edit, and ignore the others.

Only if all the cases return! Only then is it obvious that you have independent cases. E.g. suppose we have three Boolean inputs x, y, z and want to do something for each binary combination:

  if (x) {
    if (y) {
      if (z) {
         return 7;
      } else {
         return 6;
      }
    } else { // x && !y
      if (z) {
         return 5;
      } else {
         return 4;
      }
    }
  } else { // !x
    if (y) { // !x && y
      if (z) {
         return 3;
      } else {
         return 2;
      }
    } else { // !x && !y
      if (z) {
         return 1;
      } else {
         return 0;
      }
    }    
  }

How would this look with early returns? One obvious way:

  if (x && y && z)
    return 7;
  if (x && y && !z)
    return 6;
  if (x && !y && z)
    return 5;
  // ... etc

(Let's ignore that we can just calculate the output with some bit twiddling; that's not the point).

Early return can be very clear, if we can bear repeatedly testing some conditions.

[-]

fenomas 8 months ago

> Only if all the cases return!

My comment was about using if blocks as opposed to early returns. I.e. where the nested ifs run exhaustively and return afterwards.

Also, obviously deep nested ifs aren't good, so I wasn't advocating them - I just think it's better to fix them by splitting functions or simplifying control flow, than by adding early returns.

[-]

kazinator 8 months ago

So it is more about multiple returns, versus setting some local return variable and returning in one place.

However, setting that return variable can be recognized as a simulated return. If we know that after "ret = 42", there are no other state changes; that the whole if/else mess will drop out to the end where there is a "return ret", then we can just read it as "return 42".

[-]

fenomas 8 months ago

Sure, in the narrow case where the function only calculates a single return value and has no side effects.

[-]

kazinator 8 months ago

Or where it produces an effect (or effect group) in every case just before returning, without multiple effects interspersed among multiple condition tests.

[-]

fenomas 8 months ago

That's isomorphic to what I said, so... also yes :D

jonathanlydall 8 months ago

Calling it an anti pattern (as opposed to a subjective preference) is in my opinion super dangerous and reeks of cargo-culting as it implies an active avoidance of it which can result in deep nesting or contorted and hard to read logic.

There is no hard rule about early returns being always good or always bad, it depends on the particular situation.

[-]

8 months ago

[deleted]

fenomas 8 months ago

> > It definitely depends, but personally I find..

> ..(as opposed to a subjective preference) is in my opinion super dangerous and reeks of cargo-culting..

o_O

aidos 8 months ago

I call them guard conditions and it’s my preferred pattern too. Deal with all the exceptional cases at the top and then move on to the happy path as a linear flow.

[-]

hollerith 8 months ago

And at the start of the happy path, I tend to put "Do it" as a comment (a convention I learned from reading Emacs Lisp code).

kazinator 8 months ago

Early returns are easy to read because although return is a staunchly imperative construct (a form of "go to"), early returns are structured such that they simulate a multi-case conditional from a functional language.

You know that each early return completely handles its respective case; if that branch is taken, that's it; the function has ended. There is only way to reach the code past the if/return, which is that the condition has to fail.

The conditionals inside a function that has a single return are harder to read, because no conditional is necessarily final.

if/elses that all return can be readable:

  if (this) {
    if (that) {
      return x;
    } else {
      return y;
    }
  } else {
    return z;
  }

still, it can be flattened:

  if (!this)
    return z;
  if (that)
    return x;
  return y;

It's shorter and less nested, and so that's a readability improvement. It's not as easy to see that x is returned when both this and that hold. The intermediate version below helps with that:

  if (this) {
    if (that)
      return x;
    return y;
  }

  return z;

If the conditions are cheaply tested variables, or simple expressions easily optimized by the compiler, there is also:

  if (this && that)
    return x;
  if (this)
    return y;
  return z;

jonhohle 8 months ago

I usually find the opposite (personally). Get rid of all the exceptional cases and error handling up front like you have in the first example and then spend the remaining body of the function doing the main work of that function.

It’s not so much indentation that’s an issue, but coupling control flow with errors and exceptions.

Swift does a nice job with `guard` statements that basically bake this in at the language level - a condition succeeds or you must return or throw.

If that control flow is part of business logic, I don’t think there’s any issue with your second example. That’s what it’s there for.

RaftPeople 8 months ago

> I don't find this easier to read:

I agree. The previous iteration shown is simpler IMO.

I've really shifted how I code to making things just plain simple to look at and understand.

[-]

f1yght 8 months ago

That's the way it should be, easy to understand. This set up might be short but it's complex to read.

amonith 8 months ago

It could be "dangerous" even sometimes if you're not paying attention. In JS/TS "||" operator evaluates the right side when the left side is "falsy". "Falsy" doesn't mean only null/undefined, but also "", 0, NaN, and... well... false. So if you make a method like "isUserActive" or "getAccountBalance" and do a throw like that, you'll get an error for valid use cases.

[-]

jay_kyburz 8 months ago

Whats more, the isUserValid function can't return a more detailed error about what was not valid about the User. It can only return falsy.

LorenPechtel 8 months ago

Agreed. I do not like that line at all. I might take that approach but if I did it would be a separate IsDataValid function that checked things, one condition per line. (Might be a string of ||s, but never run together like that.) As much as possible I want one line to do one thing only.

Chris_Newton 8 months ago

If I were reviewing the original code, the first thing I’d question is the line

    user.password = await hashPassword(user.password);

1. As a rule, mutations are harder to understand than giving new names to newly defined values.

2. The mutation here apparently modifies an object passed into the function, which is a side effect that callers might not expect after the function returns.

3. The mutation here apparently changes whether user.password holds a safe hashed password or a dangerous plain text password, which are bad values to risk mixing up later.

4. It’s not immediately obvious why hashing a password should be an asynchronous operation, but there’s nothing here to tell the reader why we need to await its result.

At least three of those problems could trivially be avoided by naming the result hashedPassword and, ideally, using TypeScript to ensure that mixing up plain text and hashed passwords generates a type error at build time.

I do agree with many of the other comments here as well. However, I think the above is more serious, because it actually risks the program behaving incorrectly in various ways. Questions like whether to use guard clauses or extract the password check into its own function are more subjective, as long as the code is written clearly and correctly whichever choices are made.

[-]

tetha 8 months ago

> At least three of those problems could trivially be avoided by naming the result hashedPassword and, ideally, using TypeScript to ensure that mixing up plain text and hashed passwords generates a type error at build time.

Going that path further ends up what a few code bases I've worked with do: Pull the two domains apart into a "UserBeingCreated" and an existing "User".

This felt a bit weird at first, but the more I think about it, the more sense it makes. One point leaning towards this: You are dealing with different trust levels. One is a registered and hopefully somewhat validated user, which can be trusted a bit. The other thing could just be a drive by registration attempt.

And you're dealing with different properties. Sure, there is some overlap - username, mail, firstname, lastname. But only a UserBeingCreated needs validation errors or a clear text password. Other things - like groups, roles and other domain properties only make sense after the user is properly registered.

[-]

Chris_Newton 8 months ago

I’ve had this debate a few times too. Personally I am in the camp that says you’re talking about two interfaces — your external UI or API, and your internal database schema — so even though you’ll often have a lot of overlap between types representing analogous entities in those two interfaces, they aren’t really the same concept and coding as if they will or should always have identical representations is a trap. I would almost always prefer to define distinct types and explicit conversion between them, even though it’s somewhat more verbose, and the password hashing here is a good example of why.

I wrote a more about this in a Reddit post a while back if anyone’s interested: https://www.reddit.com/r/Python/comments/16w97i6/flask_300_r...

jcparkyn 8 months ago

Agreed, and then there's the time of check/time of use issue with creating a user. Probably not a vulnerability if userService is designed well, but still a bit dubious.

[-]

Chris_Newton 8 months ago

You’re right, that’s potentially a correctness issue as well. Ideally we’d have a creation interface that would also perform the pre-existence check atomically, so there would be no need for the separate check in advance and the potential race condition would not exist. This does depend on the user service providing a convenient interface like that, though, and alas we aren’t always that lucky.

cjfd 8 months ago

Typescript looks much, much better than what he ends up with. The typescript is more or less the same thing but with comment tokens removed. How is just removing the comment tokens not an obvious improvement in readability?

Honestly, I think all of jsdoc, pydoc, javadoc, doxygen is stuff that most code should not use. The only code that should use these is code for libraries and for functions that are used by hundreds or thousands of other people. And then we also need to notice that these docs in comments are not sufficient for documentation either. When a function is not used by hundreds or thousands of people, just write a conventional comment or perhaps not write a comment at all if the function is quite straightforward. Documentation that explains the big picture is much more important but that is actually somewhat hard to write compared to sprinkling jsdoc, pydoc, javadoc or doxygen worthless shit all over the place.

[-]

8 months ago

[deleted]

joecarrot 8 months ago

If one of my developers used "||" that way I would definitely throw some side eye

[-]

JellyBeanThief 8 months ago

I was thinking exactly the same. You can write

    if (cond) { cons }

on one line and get more readable code admittedly a few chars longer.

[-]

65 8 months ago

Don't even need the curly braces. I do

    if (cond) doSomething();

all the time.

[-]

gnarlouse 8 months ago

if (heathen) dontUseCurlies();

[-]

Groxx 8 months ago

    if (dontUseCurlies):
      goto fail
      goto fail

[-]

gnarlouse 8 months ago

go to jail

alilleybrinker 8 months ago

Code patterns are social! What is strange to one is normal to another.

The kind of pattern used here with the `||` might seem weird to some JavaScript developers, but it's pretty normal in shell scripts, and it's pretty normal in Ruby with `unless`!

[-]

mmastrac 8 months ago

`||` is conventional in bash, but definitely not in JS. `x || value` is closer to conventional JS, but using it for control flow is certainly not common and a stylistic choice, for sure.

[-]

8 months ago

[deleted]

shiroiushi 8 months ago

>The kind of pattern used here with the `||` might seem weird to some JavaScript developers, but it's pretty normal in shell scripts

Shell scripts are NOT known for being easy to read. They're full of obscure and sometimes frankly bizarre, arcane syntax that newcomers would have no idea about. Quick, what does "$#" mean? An experienced bash programmer would know, but no one else would. Shell scripts were never meant to be easy to read; they're just an extension of the shell syntax, and of course vary a lot from shell to shell (e.g. bash vs zsh vs ksh etc.).

MetaWhirledPeas 8 months ago

I must be their target audience because as soon as they used the example with || it all started making sense.

This would have been fine too but it would trigger some people not to use {}

    if (!validateUserInput(user)) throwError(err.userValidationFailed);

My preferred style might be closer to this.

    if (!userInputIsValid(user)) throwError(err.userValidationFailed);

gnarlouse 8 months ago

If one of my developers threw in a throw like that I would throw up in their mouth.

dvt 8 months ago

The writer here misunderstands how short-circuit evaluation is supposed to be used. The idea is that you should use SCE in a few, pretty standard, cases:

    cheapFunction(...) || expensiveFunction(...) // saves us a few cylces
    car = car || "bmw" // setting default values, common pattern
    funcA(...) && funcB_WhichMightBreakWithoutFuncA(...) // func A implies func B
    ...
    // probably a few other cases I don't remember

Using it to handle control flow (e.g. throwing exceptions, as a makeshift if-then, etc.) is a recipe for disaster.

[-]

trealira 8 months ago

Short-circuiting evaluation is also useful for things like this:

  function insertion_sort(a) {
      for (let i = 1; i < a.length; i++) {
          let key = a[i];
          let j = i;
          while (j > 0 && key < a[j-1]) {
              a[j] = a[j-1];
              j--;
          }
          a[j] = key;
      }
  }

If short circuit evaluation didn't exist, then "key < a[j - 1]" would be evaluated even in the case where j = 0, leading to the array being indexed out of bounds.

38 8 months ago

I would go further to say that syntax should never be used. for example with Go:

> cheapFunction(...) || expensiveFunction(...)

is not valid unless both functions return bool

> car = car || "bmw"

is not valid at all, because both types would need to be bool

> funcA(...) && funcB_WhichMightBreakWithoutFuncA(...)

not valid unless functions return bool. I think Go smartly realized this syntax is just sugar that causes more problems than it solves.

[-]

lolinder 8 months ago

This has nothing to do with syntax and short circuiting and everything to do with Go's type system. Go, like most compiled languages, has no concept of "truthiness". JavaScript is not Go and has truthiness.

We can debate the merits of truthiness and using it this way, but let's have that debate on the merits, not by invoking other languages with completely different design constraints.

Your argument here is similar to what got us "no split infinitives" in English (grammarians wanted English to be more like Latin).

[-]

38 8 months ago

just because a footgun exists, doesn't mean you should use it

[-]

lolinder 8 months ago

Then let's talk—with specifics!—about why it's a footgun and we shouldn't use it.

"Because Go doesn't support it" would also be a reason to avoid:

* Generics (at least until recently)

* Classes

* Prototypes

* Exceptions

* Async/Await

* Package managers (until recently)

* Algebraic data types

* Effect types

* Type inference

* Type classes

* Etc.

You could argue that one or more of these are footguns, but I seriously doubt you'd consider them all to be, so let's talk about what separates the footgun from the feature that just didn't fit in Go's design.

[-]

38 8 months ago

This isn't about Go. This is about a language construct (&& and ||) that I would argue is terrible and should never be used. Its mainly just sugar, and in my experience code heavy on these is harder to read and write. Couple that with truthiness and it's even worse.

[-]

trealira 8 months ago

> This is about a language construct (&& and ||) that I would argue is terrible and should never be used.

Something tells me you'd hate my five-line implementation of the standard library function fgets in C.

  char *fgets(char *buf, size_t n, FILE *fp) {
      char *p = buf;
      int c = 0;
      while (n > 1 && (c = getc(fp)) != EOF && (*p++ = c) != '\n')
          n--;
      return n > 0 && (c != EOF || feof(fp) && p != buf) ? *p = '\0', buf : 0;
  }

I'm joking; I'd never write code this way in earnest.

variadix 8 months ago

Types are the best form of documentation because they can be used to automatically check for user error, are integral to the code itself, and can provide inline documentation. The more I program in dynamically typed (or even weakly statically typed) languages the more I come to this conclusion.

[-]

einpoklum 8 months ago

You're not wrong, but if your code is full of bespoke types that are only relevant to a couple of places where they're used, you hurt interoperability; you lock yourself in to how things are done now; and you may just be shifting the burden of making sense to someplace else in the code.

If you are able to formulate types which can be grasped without reading and re-reading their code - that's a win; and if you are able to express more of your code in terms of more generic non-trivial types and data structures, which are language-idiomatic - then that's a double win, because you're likely to be able to apply library functions directly, write less code, and have your code readers thinking "Oh, variadix is just doing FOO on the BAR structure, I know what that means".

[-]

callc 8 months ago

> if your code is full of bespoke types that are only relevant to a couple of places where they're used, you hurt interoperability; you lock yourself in to how things are done now; and you may just be shifting the burden of making sense to someplace else in the code

What is an example of bespoke types? Is is all compound types (structs, classes)? If you need interop or extensibility, make an API. Feel free to use whatever types in your library you want, just make sure to agree to the API interface.

I’m honestly not sure what style of programming you are advocating for einpoklum. Can you provide an example?

All non-primitive types are essentially n-ary trees of types of sub-members (ignoring unions) with primitive types as leaves. Passing your non-primitive type so some external library function is accomplished by the library declaring that the type it receives must adhere to an interface.

[-]

einpoklum 8 months ago

> Is is all compound types (structs, classes)?

Well, plain-old-data structs are not something I would call bespoke types. Classes for which you write code, I suppose.

> All non-primitive types are essentially

If you only use plain structs, then maybe.

hinkley 8 months ago

> types that are only relevant to a couple of places

do not create architectural lock-in. Friction in type declaration comes from too many cooks in the kitchen. If only two chunks of code see a type, just change it.

0xbadcafebee 8 months ago

"Self-documenting code" is already a thing called Code-as-Docs. It's the inverse of Docs-as-Code, where you're "writing documentation like you write code". Code-as-Docs is where you write Code that is self-documenting. (And this has absolutely nothing to do with Literate Programming.)

You do not have to adhere to any specific principles or methods or anything specific in order to do Code-as-Docs. Just write your code in a way that explains what it is doing, so that you don't need comments to understand it.

This often means refactoring your code to make it clearer what it does. It may not be what your ideal engineer brain wants the code to do, but it will make much more sense to anyone maintaining it. Plus very simple things like "variables-that-actually-describe-what-they-do" (in a loop over node names, don't make a variable called x; make a variable called node_name)

edit It seems like I'm the only one who says "Code-as-docs"... by searching for "Code-as-documentation" instead of "Code-as-docs", I found this: https://martinfowler.com/bliki/CodeAsDocumentation.html

I guess "self-documenting code" more hits: https://www.google.com/search?q=self-documenting+code https://en.wikipedia.org/wiki/Self-documenting_code https://wiki.c2.com/?SelfDocumentingCode

[-]

Izkata 8 months ago

I think "self-documenting code" is older than those other two short terms. I, at least, don't think I've ever heard of them, but I was aware of self-documenting code around 20 years ago in school.

amonith 8 months ago

After 10 years as a commercial dev I've noticed I don't really care about things like this. Not sure if it ever made a difference. The "local code" - as in anything within a function or often a single class (1-2k LoC is not really a problem) - is trivial to read in most languages. The most difficult thing to understand always was the domain or the infrastructure/library quirks - stuff that's never properly documented. (Hot take: might not be worth to document anyway as it takes longer to write and update such docs than to struggle with the code for a little bit).

Naming or visual code structure was never a problem in my career so far.

[-]

hinkley 8 months ago

> Naming or visual code structure was never a problem in my career so far.

Either you're the high verbal skill person on the project and you haven't noticed yet that everyone keeps coming to you to name things, you're in a pocket of devs with high verbal skills so you don't see problems, or you're in an echo chamber where everyone is equally bad and you don't know what 'good' looks like.

There's literally a programmer joke about how hard naming things is. If you don't understand a programmer joke you should probably pull on that thread real hard to figure out why.

[-]

amonith 8 months ago

> everyone keeps coming to you to name things

Nah, not really. Years ago when we worked in office and I was still a junior we did discuss naming things here and there but I didn't like those conversations. It's all just opinions and habits.

> you're in a pocket of devs with high verbal skills

I mean, we don't hire juniors so it's possible but doubt it.

> you're in an echo chamber where everyone is equally bad

Most probably but if no one is having problems with reading code then I'm not sure what bad even means. If it's in English and it describes what it does and does not use any tricks - it absolutely doesn't matter how exactly is it worded and structured. The examples from the article are especially bad. I've seen some code I'd consider bad in general but it was all about those library hacks (usually ORMs) I mentioned previously or about trying to fit a lot in a single LINQ statement (i'm a c# dev). The only time when I'd consider naming to be bad was were all variable and function names were in German.

> There's literally a programmer joke about how hard naming things is.

It's a joke because it's not that serious. It might be hard sometimes to come up with any name for a thing (especially for non-english speakers like me) and you plop down a "var x" or something even less mature and run which is funny. So it is hard but it's not a "real" problem.

reportgunner 8 months ago

> There's literally a programmer joke about how hard naming things is. If you don't understand a programmer joke you should probably pull on that thread real hard to figure out why.

Those are for people who are new to programming.

tln 8 months ago

I find the comment at the end interesting

// Creates a user and returns the newly created user's id on success

Hmm, it returns an id? But the @returns is Promise<any>? The code as written will change when userService.create changes... without the actual, human readable bit of prose, that potential code issue could be easily overlooked.

Of course, here the code could have a newtype for UserId and return Promise<UserId>, making the code better and then the prose is basically not needed (but please just write a docstring).

FWIW I would document that the `user` parameter is modified. And document the potential race condition between checking the existence of a user and creating a user, and maybe why it was chosen to be done in this order (kinda flimsy in this example). Which would probably lead me to designing around these issues.

Trying to only document via self-documenting code seems to always omit nuances.

    /** Create a user and return the id, or throw an error with an appropriate code.
     * 
     * user.password may be changed after this function is called.
     */
    async function createUser(user: User): Promise<number> {
        if (!validateUserInput(user)) {
            throw new Error(err.userValidationFailed);
        }

        if (isPasswordValid(user.password)) {
            // Check now if the user exists, so we can throw an error before hashing the password.
            // Note: if a user is created in the short time between this check and the actual creation, 
            // there could be an unfriendly error
            const userExists = !!(await userService.getUserByEmail(user.email));
            if (userExists) {
                throw new Error(err.userExists);
            }
        } else {
            throw new Error(err.invalidPassword);
        }

        user.password = await hashPassword(user.password);
        return userService.create(user);
    }

[-]

Savageman 8 months ago

This, but move isPasswordValid below if (!isUserValid)

gnarlouse 8 months ago

Having a function throwError makes me squirm.

`isValid() || throwError()` is an abuse of abstraction

[-]

t-writescode 8 months ago

A fail fast paradigm is a style of programming that can be used very effectively, as long as it's understood that's the style of code that is being written.

Much of my code, for example, is fail fast, and then I have error handling, supervisors, etc, at a level that can log and restart the work.

[-]

Izkata 8 months ago

Not in anything touching user or password validation. It's probably fairly safe the way it's used in this function, but I'd avoid it as a rule of thumb to not accidentally introduce some sort of timing attack.

I'm a little disappointed no one mentioned this even as a side comment, since there's a whole section on doing short-circuit evaluation, even if it's not in a way that would cause this kind of problem.

jnsie 8 months ago

I lived in the C# world for a while and our style guides mandated that we use those JSDoc style comments for every function definition. I loathed them. They invariable became a more verbose and completely redundant version of the function definition. Developers even used a tool (GhostDoc, IIRC) to generate these comments so that CreateNewUser() became // Create New User. Nobody ever read them, few ever updated them, and they reinforced my hunch that a small percentage of comments are useful (in which case, by all means, use comments!)

[-]

einpoklum 8 months ago

I'm a library author and maintainer, in C++ rather than C#. I know what you mean about the redundancy, and my solution to that - which, of course, requires management approval in your case - is the following:

1. It is legitimate to skip parts of a doxygen/JSDoc comment (return value, parameter description) if it is _entirely_ trivial. But:

2. You must document the _why_. Well-written function signatures can tell you the "what", but they usually not tell you the "why".

3. You must make an effort to squeeze that non-triviality of a documented aspect. So, you mention corner-case behavior; you relate the use of a parameter to its use elsewhere; and of course - you describe "why"'s: Why that parameter is _not_ of another type, or why it can't be avoided.

I guarantee you, that if you follow rules (2.) and (3.), you will usually not get to apply (1.); and at the same time, your doxygen comments will stop being annoying boilerplate that you automatically gloss over, because you would be much more likely to find useful information there.

[-]

rerdavies 8 months ago

Could you provide an example of a function or argument type that needs a 'why' explanation. Not getting it.

[-]

skydhash 8 months ago

Not GP, but here is an example of one: https://git.sr.ht/~delthas/senpai/tree/master/item/ui/buffer...

It’s mostly done to mark contentious, complicated, or good enough implementations because after reading the code your first reflex would be: Why is it done that way?

Also personally I mark part of the code with short What? or How? comments for quicker scanning especially if the syntax is terse or it’s not immediately obvious. Kinda like a table of contents.

[-]

rerdavies 8 months ago

lol. Fabulous comment!

I entirely understand the use of why comments in the body of code.

I was more curious about why GP thought that functions and parameters should always have why annotations (and only have why annotations). Which seems a bit monstrous, unless I'm misunderstanding something.

gtirloni 8 months ago

Is writing a few comments here and there explaining why things are done in a certain way so terrible that we have to create this thing?

[-]

hinkley 8 months ago

Every time you write documentation, consider if it would be less time and energy to fix the problem you're describing instead of apologizing for it.

Philosophical differences between your code and a library you're using are a good example of when you explain rather than fix. But if it's just two chunks of your own code it might just be faster in the long run to make the problem go away instead of having to explain repeatedly why it's there at random intervals. People underestimate the cost of interruptions, and load up their future with self-imposed interruptions.

mannyv 8 months ago

Code only tells you 'what,' not 'why.' And 'why' is usually what matters.

[-]

Vinnl 8 months ago

https://github.com/reactjs/react.dev/issues/3896

(I do generally agree with your point.)

[-]

hinkley 8 months ago

I feel dumber for having read that.

mmastrac 8 months ago

I've been developing for a very long time and I'm neither on the side of "lots of comments" or "all code should speak for itself".

My philosophy is that comments should be used for two things: 1) to explain code that is not obvious at first glance, and 2) to explain the rationale or humanitarian reasons behind a bit of code that is understandable, but the reasons for its existence are unclear.

No philosophy is perfect, but I find that it strikes a good balance between maintainability of comment and code pairing and me being able to understand what a file does when I come back to it a year later.

The article is not good IMO. They have a perfect example of a function that could actually make use of further comments, or a refactoring to make this more self-documenting:

  function isPasswordValid(password) {
    const rules = [/[a-z]{1,}/, /[A-Z]{1,}/, /[0-9]{1,}/, /\W{1,}/];
    return password.length >= 8 && rules.every((rule) => rule.test(password));
  }

Uncommented regular expressions are a code smell. While these are simple, the code could be more empathetic to the reader by adding at least a basic comment:

  function isPasswordValid(password) {
    // At least one lowercase, one uppercase, one number and one symbol
    const rules = [/[a-z]{1,}/, /[A-Z]{1,}/, /[0-9]{1,}/, /\W{1,}/];
    return password.length >= 8 && rules.every((rule) => rule.test(password));
  }

Which would then identify the potentially problematic use of \W (ie: "[^a-zA-Z0-9]"). And even though I've been writing regular expressions for 20+ years, I still stumble a bit on character classes. I'm likely not the only one.

Now you can actually make this function self-documenting and a bit more maintainable with a tiny bit more work:

  // Returns either "true" or a string with the failing rule name.
  // This return value is kind of awkward.
  function isPasswordValid(password) {
    // Follow the password guidelines by WebSecuritySpec 2021
    const rules = [
      [MIN_LENGTH, /.{8,}/],
      [AT_LEAST_ONE_LOWERCASE, /[a-z]{1,}/],
      [AT_LEAST_ONE_UPPERCASE, /[A-Z]{1,}/],
      [AT_LEAST_ONE_NUMBER, /[0-9]{1,}/],
      // This will also allow spaces or other weird characters but we decided
      // that's an OK tradeoff.
      [AT_LEAST_ONE_SYMBOL, /\W{1,}/],
    ];

    for (const [ruleName, regex] of rules) {
      if (!regex.test(password)) {
        return ruleName;
      }
    }

    return true;
  }

You'd probably want to improve the return types of this function if you were actually using in production, but this function at least now has a clear mapping of "unclear code" to "english description" and notes for any bits that are possibly not clear, or are justifications for why this code might technically have some warts.

I'm not saying I'd write this code like this -- there's a lot of other ways to write it as well, with many just as good or better with different tradeoffs.

There are lots of ways to make code more readable, and it's more art than science. Types are a massive improvement and JSDoc is so understandably awkward to us.

Your goal when writing code shouldn't be to solve it in the cleverest way, but rather the clearest way. In some cases, a clever solution with a comment can be the clearest. In other cases, it's better to be verbose so that you or someone else can revisit the code in a year and make changes to it. Having the correct number of comments so that they add clarity to code without having too many that they become easily outdated or are redundant is part of this as well.

[-]

tpmoney 8 months ago

> Having the correct number of comments so that they add clarity to code without having too many that they become easily outdated or are redundant is part of this as well.

In fact, I would argue its not only part of it, it's an essential part. I've worked on projects that were written by folks who were very into the "the source code is the documentation" and comments were extremely minimal. Which was fine for the most part, they did a pretty good job at naming things. The problem was what happened when you were trying to debug a problem. The problems usually start looking like this:

"I see ${function} code does X, and that's causing Y which ultimately leads to bug Z, is ${function} supposed to do X, and we should be checking the condition at another level where Y happens or should ${function} not be doing X and that's the root bug?". And no one knows the answer because there's no documentation about what the function is SUPPOSED to do. I view actual comments and documentation about code as the "parity bit" equivalent for CRC checks / RAID 5 storage. If the code and the documentation agree, great now we can have a discussion about whether the code needs to change. If the code and the documentation disagree, then the code is wrong just like if I get data and a parity that don't match, the data is wrong. Your example of the "AT_LEAST_ONE_SYMBOL" with the associated comment is a perfect example of why this is critical. If some months down the line spaces in the password is causing a problem, I now know the code is supposed to be allowing spaces and we know we need to fix the part that's choking on spaces, not the validation rule.

Yes it's possible the documentation/parity is wrong, but in either case it calls out a large warning that we're out of sync at this point, and need to get in sync before we can continue.

hinkley 8 months ago

When I put a hack in place as a workaround for a bug in a dependency, I always link the ticket for the dependency in the PR and the commit comment. If the code looks like someone held a gun to my head or I was on drugs when I wrote it, I'll also put it inline in a comment above the workaround.

Not only to explain why this stupid code is here, but to give the next person permission to delete it if the bug has been fixed upstream and we have pulled the new version down.

LorenPechtel 8 months ago

Argh! While I agree 100% with your thoughts here I find how you wrote the list of rules to be very hard to read. And I don't like the comment in the middle of the rules, move it above the declaration.

Regex is about the opposite of self documenting as it comes, anything non-trivial needs an explanation!

[-]

mmastrac 8 months ago

Well yeah, that's why I said it was "art" :). It's very subjective what's readable and what isn't, and putting code on HN without syntax highlighting (and with the ironic difficulty of editing code here) is one of hardest ways to communicate.

Mikhail_Edoshin 8 months ago

Names do have some importance. If you pick random words and assign them to things you deal with you will find yourself unable to reason about them. Try it, it is interesting. Yet names are not the pinnacle of design. Far from it.

Look at a mechanical watch. (For example, here: https://ciechanow.ski/mechanical-watch/). Those little details, can you come up with self-documenting names for them? I do not think so. In programming good design is very much like that watch: it has lots of strangely-looking things that are of that shape because it fits their purpose [1]. There is no way to give them some presumably short labels that explain that purpose out of the context. Yet we need to point to them as we talk about them [2]. The role of names in programming is thus much more modest. In the order of importance:

- They must be distinct within the context (of course). - Yet their form must indicate the similarities between them: alen and blen are of the same kind and are distinct from abuf and bbuf, which are also of the same kind. - They must be pronounceable and reasonably short. Ideally they should be of the same length. - They need to have some semblance to the thing they represent. - It would be nice to make them consistent across different contexts. Yet this is is incredibly tedious task of exponential complexity.

There is also the overall notation. Ideally it should resemble written reasoning that follows some formal structure. None of existing notations is like that. The expressive tools in these notations are not meant to specify reasoning: they are meant to specify the work of a real or virtual machine of some kind. The fallacy of self-documenting code is an unrecognized desire to somehow reason with the knobs of that machine. It will not work this way. Yet a two-step process would work just fine: first you reason, then you implement this on the machine. But it will not look self-documenting, of course. P. S. This is a major problem in programming: we keep the code, but do not keep the reasoning that led to it.

[1] fitness for the purpose, Christopher Alexander, “The timeless way of building”. [2] notion vs definition, Evald Ilyenkov.

virgilp 8 months ago

missed the opportunity to create named constants for each of the password validation rules.

sesteel 8 months ago

Looking at this thread, it is a wonder that any PRs make it through review. I started calling these kinds of debates Holographic Problems.

- Spaces vs Tabs

- Self documenting code vs documented code

- Error Codes vs Exceptions

- Monolithic vs Microservices Architectures

- etc.

Context matters and your context should probably drive your decisions, not your personal ideology. In other words, be the real kind of agile; stay flexible and change what needs to be changed as newly found information dictates.

Aeolun 8 months ago

> Short-circuit evaluation allows us to simplify conditional statements by using logical operators.

Simple is not the same thing as understandable.

They lost me entirely here.

Izkata 8 months ago

I'd like to propose weird alternative to this:

  function throwError(error) {
      throw new Error(error);
  }
  
  async function createUser(user) {
      validateUserInput(user) || throwError(err.userValidationFailed);
      isPasswordValid(user.password) || throwError(err.invalidPassword);
      !(await userService.getUserByEmail(user.email)) || throwError(err.userExists);

What if...

      [
          [() => validateUserInput(user), err.userValidationFailed],
          [() => isPasswordValid(user.password), err.invalidPassword],
          [() => !(await userService.getUserByEmail(user.email)), err.userExists],
      ].forEach(function([is_good, error]) {
          if (!is_good()) {
              throw new Error(error);
          }
      });

Also on the regex:

  const rules = [/[a-z]{1,}/, /[A-Z]{1,}/, /[0-9]{1,}/, /\W{1,}/];

No one caught that in all four of these, "{1,}" could be replaced with the much more common "+". A bit odd considering the desire for brevity. I do personally prefer "[0-9]" over "\d", especially considering the other rules, but can go either way on "\W".

I might have also added a fifth regex for length though, instead of doing it differently, if my head was in that mode: /.{8,}/

[-]

110jawefopiwa 8 months ago

I assume this is satire, but for those who might take this seriously, please avoid doing tricks like this.

You're doing so much extra work here. Creating many new arrays, running a bunch of extra function calls, creating extra closures, and really obfuscating code from the engine. This will tank performance.

This is the point at which people come back at me with something about "premature optimization" being bad. That's all well and good, but if you prematurely pessimize and let these patterns creep throughout your codebase, you end up with products that are significantly slower than they should be.

I've spent quite a while working on JS engines, and it always impresses me how much extra work exists in JS developers' code, seemingly for no real reason, and it's slowing down the entire internet. This doesn't appear to be better for the developer, the user, or any potential future maintainers.

jansommer 8 months ago

> The first change I would make is to use named constants instead of cryptic error codes.

But he keeps the cryptic error codes that will go into the logs, or in the frontend where the developer will have to look up the error code. Don't map an error name to u105, just return the actual string: "userValidationFailed".

G_o_D 8 months ago

function isPasswordValid(password) { return /^(?=.[a-z])(?=.[A-Z])(?=.[0-9])(?=.\W).{8,}$/.test(password); }

function isPasswordValid(password) { const issues = []; if (password.length < 8) issues.push("Minimum length is 8 characters"); if (!/[a-z]/.test(password)) issues.push("Must contain at least one lowercase letter"); if (!/[A-Z]/.test(password)) issues.push("Must contain at least one uppercase letter"); if (!/[0-9]/.test(password)) issues.push("Must contain at least one digit"); if (!/\W/.test(password)) issues.push("Must contain at least one special character"); return issues.length > 0 ? issues : ["Password is valid"]; }

reportgunner 8 months ago

I don't like this article, author just added some abstraction and moved the stuff that matters out of the perspective and we just have to imagine that whatever is outside the perspective is perfect.

tehologist 8 months ago

All source code is self documenting, source code is for developers to read. Computer languages are a human readable version of what the compiler executes and is for the developers, not the compilers benefit. As a developer, I read way more software than I write and if the source is hard to understand then I feel you failed as a developer. Writing is a skill, no less important in software than anywhere else. Properly named functions/variables and easy to follow flow control is a skill that takes years to learn. All developers should keep a thesaurus and a dictionary nearby. If you find yourself writing a lot of comments trying to explain what you are doing in your code, then you probably should refactor.

gspencley 8 months ago

I agree with most of the article but want to nitpick this last part:

> I’m not a fan of TypeScript, but I appreciate its ability to perform static type checks. Fortunately, there’s a way to add static type checking to JavaScript using only JSDoc comments.

If you're writing JSDoc comments, then you're not writing what the author considers to be "self-documenting code."

I wish the author had explained why they are not a fan of TypeScript. Compile time type-safety aside, as the author acknowledges by implication adding type specificity negates the usefulness of JSDoc comments for this particular situation.

I'm personally a big proponent of "self documenting code" but I usually word it as "code that serves as its own documentation because it reads clearly."

Beyond "I would personally use TypeScript to solve that problem", my case for why ALL comments are a code smell (including JSDoc comments, and in my personal opinion) is:

- They are part of your code, and so they need to be maintained just like the rest of your code

- But ... they are "psychologically invisible" to the majority of developers. Our IDEs tend to gray them out by default etc. No one reads them.

- Therefore, comments can become out of sync with the code quite easily.

- Comments are often used to explain what confusing code does. Which means that instead of fixing the code to add clarity, they do nothing but shine a spotlight on the fact that the code is confusing.

- In doing the above, they make messy code even messier.

I am slightly amenable to the idea that a good comment is one that explains WHY weird code is weird. Even then, if you have the luxury of writing greenfield code, and you still need to do something un-intuitive or weird for really good reasons ... you can still write code that explains the "why" through good naming and separation of concerns.

The only time that I would concede that a code comment was the best way to go about things in context is when you're working with a very large, legacy commercial code-base that is plagued by existing tech debt and you have no good options other than to do your weird thing inline and explain why for logistical and business reasons. Maybe the refactor would be way too risky and the system is not under test, the business has its objectives and there's just no way that you can reasonably refactor in time etc. This happens... but professional developers should ideally treat incremental refactoring as a routine part of the development lifecycle so that this situation is as unlikely as possible to arise in the future.

[-]

mjlawson 8 months ago

I find myself agreeing with much of your point, but I feel the need to nitpick a bit of your comment myself :)

I don't think your code base needs to be very large, or very legacy in order for comments to be valuable or even the best way forward. If the decision exists between a somewhat large refactor or a one-off comment to account for an edge case, I'm likely to take the latter approach every time. Refactors introduce risk, add time, and can easily introduce accidental complexity (ie: an overengineered solution). Now once that edge case becomes more common, or if you find yourself adding different permutations, yeah I agree that an incremental refactor is probably warranted.

That said, perhaps that comment could — and certainly one should at least supplement it — be replaced with a unit test, but I don't think its presence harms anything.

subversive-dev 8 months ago

The picture at the top of article is of a German public library. It has a beautiful, clean architecture. Somehow I don't see how it relates to self-documenting code.

[-]

reportgunner 8 months ago

Documents are a bunch of text and library is a bunch of text. /s

dgeiser13 8 months ago

If future "AI" can write code then it should be able to read code and describe what it does at various levels of detail.

kitd 8 months ago

I anticipate the day when Gen AI gives us self-coding documentation.

[-]

amonith 8 months ago

We created precise artificial languages to avoid programming in natural languages which would absolutely suck due to never ending ambiguity. I hope that what you're hoping never happens :V It would mean the docs would have to turn into "legalese" written basically by "code lawyers" - incomprehensible for the rest of the team. They would have to write multiple paragraphs for simple things like a button on a page just so that AI makes precisely what people want - right placement, right color, right size, right action, right feedback. And they would have to invent a new system for writing references to previously defined things just to keep some maintainability. And so many more issues...

omgJustTest 8 months ago

What if the documentation were code? ...

rednafi 8 months ago

[dead]