Channel: User Lundin - Code Review Stack Exchange

Answer by Lundin for Remove all comments from a C program


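First, a classic bug: char cur_char; ... (cur_char = getchar()) != EOF. The variable must be int, not char, or you can't reliably compare it with EOF: if char is signed, a valid byte such as 0xFF can compare equal to EOF, and if char is unsigned, the comparison never succeeds. Yeah, it's really stupid that getchar returns an int, not a char, but that's how it is.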
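As a minimal sketch of the corrected loop: the function name count_bytes and the FILE* parameter are assumptions made here so the example can be run on any stream; the original program reads with getchar, which behaves the same way as getc(stdin).

```c
#include <stdio.h>

/* Counts the bytes in a stream. The loop variable is an int so that
   every byte value (0..255) stays distinguishable from EOF. */
long count_bytes(FILE *in)
{
    int cur_char;            /* int, not char: getc returns an int */
    long count = 0;

    while ((cur_char = getc(in)) != EOF)
        count++;

    return count;
}
```

With char cur_char on a platform where char is signed, a stream containing the byte 0xFF would most likely end this loop early, since that byte converts to the same value that EOF typically has (-1).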


I know this is just a simple program where performance, maintainability etc. aren't important. A real production-quality program would preferably be written differently, though. For the sake of learning, let's pretend it is one:

Then overall, you could check against a look-up table rather than using a complex series of if-else if. Those chains are kind of hard to read: the behavior for each comment character ends up scattered over various nested if-else if branches. Also, the compiler is less likely to translate an if-else if chain into a table look-up; more likely it will generate a bunch of branches, which are very bad for loop performance.

A look-up table followed by centralized "take action depending on result" code, for example a switch, would improve both execution speed and readability/maintainability.

The simplest form of such a table look-up would be to call strchr("\"\\\n/*\'", input) and then take different actions depending on whether strchr returned NULL.
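A rough sketch of that simplest form, assuming a hypothetical helper named is_special_char (not part of the reviewed program). One detail worth guarding against: strchr treats the terminating '\0' as part of the string, so a zero byte in the input would otherwise count as a match.

```c
#include <stdbool.h>
#include <string.h>

/* True if c is one of the characters the comment stripper cares about. */
bool is_special_char(int c)
{
    /* strchr also matches the terminating '\0', so rule that out first. */
    if (c == '\0')
        return false;

    return strchr("\"\\\n/*\'", c) != NULL;
}
```

The caller then only needs one test per input character, for instance if (is_special_char(cur_char)) { ... }, before deciding which action to take.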

So rather than defining all comment characters with macros, you'd have a typedef enum { DOUBLE_QUOTE_CHAR, BACK_SLASH, ... NO_COMMENT } comment_t; whose constants correspond to each character's index in the string literal searched by strchr. Then you can do:

    const char  comment_characters[] = "\"\\\n/*\'";
    const char* comment_found = strchr(comment_characters, input);
    comment_t   comm;

    if(comment_found)
      comm = (comment_t) (comment_found - comment_characters); // pointer diff arithmetic
    else // strchr returned NULL
      comm = NO_COMMENT;

    switch(comm)
    {
      case DOUBLE_QUOTE_CHAR:  /* do double quote stuff */       break;
      case BACK_SLASH:         /* do backslash stuff */          break;
      default:                 /* NO_COMMENT etc, do nothing */  break;
    }

You can even take readability/maintainability a bit further, almost to extremes, by doing this instead:

    const char comment_characters [N] = // where N is some size "large enough"
    {
      [DOUBLE_QUOTE_CHAR] = '\"',
      [BACK_SLASH] = '\\',
      [N-1] = '\0',                     // strictly speaking not necessary but being explicit is nice
    };

This guarantees integrity between the string and the enum indices.
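To make the idea concrete, here is one possible self-contained sketch. Only DOUBLE_QUOTE_CHAR, BACK_SLASH and NO_COMMENT are named in the answer; the other enum constants, the classify function and the choice of NO_COMMENT + 1 as the array size are assumptions made for illustration.

```c
#include <string.h>

/* Hypothetical enum: NEW_LINE, SLASH, ASTERISK and SINGLE_QUOTE_CHAR
   are assumed names filling in the elided constants. */
typedef enum
{
    DOUBLE_QUOTE_CHAR,
    BACK_SLASH,
    NEW_LINE,
    SLASH,
    ASTERISK,
    SINGLE_QUOTE_CHAR,
    NO_COMMENT          /* last constant, so it equals the number of characters */
} comment_t;

/* One slot per enum constant plus the terminator, so strchr can search it. */
static const char comment_characters[NO_COMMENT + 1] =
{
    [DOUBLE_QUOTE_CHAR] = '\"',
    [BACK_SLASH]        = '\\',
    [NEW_LINE]          = '\n',
    [SLASH]             = '/',
    [ASTERISK]          = '*',
    [SINGLE_QUOTE_CHAR] = '\'',
    [NO_COMMENT]        = '\0',
};

/* Maps an input character to its enum constant, or NO_COMMENT. */
comment_t classify(int input)
{
    const char *found;

    if (input == '\0')   /* strchr would match the terminator */
        return NO_COMMENT;

    found = strchr(comment_characters, input);
    return found ? (comment_t)(found - comment_characters) : NO_COMMENT;
}
```

One caveat with this layout: any enum constant left without an initializer leaves a zero byte in its slot, which would cut the string short for strchr, so every constant before NO_COMMENT needs an entry.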

