【MEDIUM】Remove Comments

Published: 2018-12-19 10:23

Problem

Original problem link: https://leetcode.com/problems/remove-comments/

Given a C++ program, remove comments from it. The program source is an array where source[i] is the i-th line of the source code. This represents the result of splitting the original source code string by the newline character \n.

In C++, there are two types of comments, line comments, and block comments.

The string // denotes a line comment, which represents that it and rest of the characters to the right of it in the same line should be ignored.

The string /* denotes a block comment, which represents that all characters until the next (non-overlapping) occurrence of */ should be ignored. (Here, occurrences happen in reading order: line by line from left to right.) To be clear, the string /*/ does not yet end the block comment, as the ending would be overlapping the beginning.
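
The non-overlapping rule can be illustrated with a small search helper. This is only a sketch: blockCommentEnd is a hypothetical name that does not come from the problem or from the solution in this post, and it assumes the caller passes the index of a real /* opener.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: `open` is the index of a "/*" opener in `text`.
// Returns the index just past the matching "*/", or std::string::npos
// if the comment is not closed within `text`.
std::size_t blockCommentEnd(const std::string& text, std::size_t open) {
    // Assumption: the caller really points at a "/*".
    assert(text.compare(open, 2, "/*") == 0);
    // Start the search two characters later, so the '*' of the opener
    // cannot be reused as the '*' of the closer. This is why "/*/"
    // leaves the comment open.
    const std::size_t close = text.find("*/", open + 2);
    return close == std::string::npos ? close : close + 2;
}
```

With this helper, blockCommentEnd("/*/", 0) reports that the comment is still open, while blockCommentEnd("/**/", 0) returns 4.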

The first effective comment takes precedence over others: if the string // occurs in a block comment, it is ignored. Similarly, if the string /* occurs in a line or block comment, it is also ignored.

If a certain line of code is empty after removing comments, you must not output that line: each string in the answer list will be non-empty.

There will be no control characters, single quote, or double quote characters. For example, source = "string s = "/* Not a comment. */";" will not be a test case. (Also, nothing else such as defines or macros will interfere with the comments.)

It is guaranteed that every open block comment will eventually be closed, so /* outside of a line or block comment always starts a new comment.

Finally, implicit newline characters can be deleted by block comments. Please see the examples below for details.

After removing the comments from the source code, return the source code in the same format.

Example 1:

Input: 
source = ["/*Test program */", "int main()", "{ ", "  // variable declaration ", "int a, b, c;", "/* This is a test", "   multiline  ", "   comment for ", "   testing */", "a = b + c;", "}"]

The line by line code is visualized as below:
/*Test program */
int main()
{ 
  // variable declaration 
int a, b, c;
/* This is a test
   multiline  
   comment for 
   testing */
a = b + c;
}

Output: ["int main()","{ ","  ","int a, b, c;","a = b + c;","}"]

The line by line code is visualized as below:
int main()
{ 
  
int a, b, c;
a = b + c;
}

Explanation: 
The string /* denotes a block comment, including line 1 and lines 6-9. The string // denotes line 4 as comments.

Example 2:

Input: 
source = ["a/*comment", "line", "more_comment*/b"]
Output: ["ab"]
Explanation: The original source string is "a/*comment\nline\nmore_comment*/b", where each \n is an implicit newline character. After deletion, the implicit newline characters inside the block comment are removed along with it, leaving the string "ab", which when split on newline characters becomes ["ab"].
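
The join-and-split view in this explanation can be sketched directly. The helper names joinLines, splitNonEmpty and example2 are assumptions made for illustration, and example2 handles only the single block comment of this example, not the general problem.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Join lines with '\n', the inverse of the split described in the problem.
std::string joinLines(const std::vector<std::string>& lines) {
    std::string text;
    for (std::size_t i = 0; i < lines.size(); ++i) {
        if (i > 0) text.push_back('\n');
        text += lines[i];
    }
    return text;
}

// Split on '\n' and drop empty pieces, as the answer format requires.
std::vector<std::string> splitNonEmpty(const std::string& text) {
    std::vector<std::string> lines;
    std::size_t start = 0;
    while (start <= text.size()) {
        std::size_t end = text.find('\n', start);
        if (end == std::string::npos) end = text.size();
        if (end > start) lines.push_back(text.substr(start, end - start));
        start = end + 1;
    }
    return lines;
}

// Example 2: erasing the comment span also erases the newlines inside it.
std::vector<std::string> example2() {
    std::string text = joinLines({"a/*comment", "line", "more_comment*/b"});
    const std::size_t open = text.find("/*");
    const std::size_t close = text.find("*/", open + 2);
    assert(open != std::string::npos && close != std::string::npos);
    text.erase(open, close + 2 - open);  // remove "/*" through "*/"
    return splitNonEmpty(text);
}
```

Calling example2() yields a single line, "ab", because both newline characters sat inside the erased span.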

Note:

  • The length of source is in the range [1, 100].
  • The length of source[i] is in the range [0, 80].
  • Every open block comment is eventually closed.
  • There are no single-quote, double-quote, or control characters in the source code.

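Analysis

  • Input: an array of source code with comments, one line per element
  • Output: the source code with the comments removed
  • Approach: there are two kinds of comments, line comments and block comments. The problem guarantees that every block comment is closed and that comment markers inside a comment are ignored (no nesting), so those special cases need no handling. For a line comment, stop processing the line as soon as // is seen. Block comments call for a state machine: seeing /* enters the block-comment state and seeing */ leaves it. Everything read while in that state is comment text and is not output. A block comment can also splice the line where it opens onto the line where it closes, so the buffer holding the current output line must be able to grow beyond one source line without overflowing.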

Solution

#include <string>
#include <vector>

using namespace std;

class Solution {
public:
    vector<string> removeComments(vector<string>& source) {
        vector<string> result;

        // True while we are inside a /* ... */ block comment
        bool multiLineComment = false;
        // Output for the current line.
        // A source line has at most 80 characters, but block comments can
        // splice several source lines into one output line, so a growable
        // string is used instead of a fixed-size buffer that could overflow.
        string lineResult;

        // Walk every source line
        for (const string& line : source) {
            const int len = (int)line.size();

            // Outside a block comment, each source line starts a fresh output line
            if (!multiLineComment) {
                lineResult.clear();
            }

            // Walk every character of the current line
            for (int i = 0; i < len; i++) {
                // Whether line[i + 1] exists, so a two-character marker can start here
                const bool hasNext = i + 1 < len;
                if (!multiLineComment && hasNext && line[i] == '/' && line[i + 1] == '/') {
                    // Line comment: ignore the rest of this line
                    break;
                } else if (!multiLineComment && hasNext && line[i] == '/' && line[i + 1] == '*') {
                    // Block comment starts
                    multiLineComment = true;
                    // Skip the '*' too, otherwise "/*/" would close the comment
                    ++i;
                } else if (multiLineComment && hasNext && line[i] == '*' && line[i + 1] == '/') {
                    // Block comment ends
                    multiLineComment = false;
                    // Skip the '/' too, otherwise it would be output
                    ++i;
                } else if (!multiLineComment) {
                    // Any other character outside a block comment is output
                    lineResult.push_back(line[i]);
                }
            }
            // Emit the line only when it is complete and non-empty
            if (!multiLineComment && !lineResult.empty()) {
                result.push_back(lineResult);
            }
        }

        return result;
    }

};

Thanks for reading.

Best wishes to you! 💕