Problem statement:

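Given a string, write code that finds the character(s) occurring most often in it; characters with the same count are output in dictionary order.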

Problem analysis:

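  • Handle invalid input (the string is null or empty)
  • More than one character may share the highest count, i.e. there can be ties
  • Characters with the same count must be sorted in dictionary order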

Code walkthrough:

package com.nokia.pats;

import com.google.common.base.Strings;  // from the Guava library

import java.util.Collections;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class MostOccurLetters {
  public MostOccurLetters() {
  }

  public List<Map.Entry<String, Integer>> countTheMostOccurLetters(String letters) {
    /*
    * 1. If the input is null or empty, return an empty List,
    * meaning that no letter satisfies the rule stated in the
    * question.
    */
    if (Strings.isNullOrEmpty(letters)) { 
      return Collections.emptyList();
    }
    /*
    * 2. Create a temporary map that stores each letter of the string with its corresponding count.
    */
    Map<String, Integer> lettersByCount = new HashMap<>();
    for (char c : letters.toCharArray()) {
      lettersByCount.put(String.valueOf(c),
            lettersByCount.getOrDefault(String.valueOf(c), 0) + 1);
    }
    /*
    * 3. The lambda-based code below sorts the map entries by
    *  count in descending order; entries with equal counts are
    * sorted by key in dictionary order.
    */
    List<Map.Entry<String, Integer>> result = lettersByCount.entrySet().stream().sorted(
          Comparator.comparing(Map.Entry<String, Integer>::getValue).reversed()
              .thenComparing(Map.Entry<String, Integer>::getKey))
                .collect(Collectors.toList());
    /*
    * 4. Keep only the entries whose count equals that of the first
    *  entry, i.e. the most frequent letters. Use equals(), not ==,
    *  so that boxed Integer counts above 127 compare by value.
    */
    return result.stream().filter(entry ->
          entry.getValue().equals(result.get(0).getValue())).collect(Collectors.toList());
  }
}
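  1. The code uses Google's popular Guava library, which offers many useful utility classes and reduces the need to reinvent the wheel. Here it uses Guava's Strings utility class, which provides common String operations, including the null-or-empty check used in this code.
  2. Create a temporary map that holds the characters split from the string and counts them. The key is the character itself and the value is its number of occurrences.
  3. Sort the entries of that map: first by count in descending order, and when counts are equal, by key in dictionary order, yielding an ordered list.
  4. Filter for the elements whose count equals that of the first element, and return the result.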
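
One detail in step 4 deserves care: the counts are boxed Integer objects, and in Java `==` on two Integer references compares identity, with only values from -128 to 127 guaranteed to be cached. The following is a minimal stdlib-only sketch of the same idea that sidesteps the issue by unboxing the maximum count into an `int` first and sorting only the tied characters. The class name `MostFrequentSketch` and method `mostFrequent` are hypothetical and are not part of the code above.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration; not part of the original class.
class MostFrequentSketch {
  // Returns the most frequent characters, ties in dictionary order.
  static List<String> mostFrequent(String s) {
    if (s == null || s.isEmpty()) {
      return Collections.emptyList();
    }
    Map<String, Integer> counts = new HashMap<>();
    for (char c : s.toCharArray()) {
      // merge() inserts 1 for a new key, otherwise adds 1 to the old count.
      counts.merge(String.valueOf(c), 1, Integer::sum);
    }
    // Unbox the maximum once so later comparisons are by value.
    int max = Collections.max(counts.values());
    List<String> winners = new ArrayList<>();
    for (Map.Entry<String, Integer> e : counts.entrySet()) {
      if (e.getValue() == max) { // Integer vs int: numeric comparison
        winners.add(e.getKey());
      }
    }
    Collections.sort(winners); // dictionary order among the ties
    return winners;
  }
}
```

Under these assumptions, a string in which "a" and "b" each occur 200 times should yield both characters, which is a case where comparing two boxed counts with `==` could silently drop one of them.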

That completes the code; next, cover it with unit tests (UT).

package com.nokia.pats;

import org.junit.Assert;
import org.junit.Before;
import org.junit.Test;

import java.util.List;
import java.util.Map;

public class MostOccurLettersTest {
  private MostOccurLetters mostOccurLetters;

  @Before
  public void setUp() {
    mostOccurLetters = new MostOccurLetters();
  }

  @Test
  public void should_return_empty_list_when_input_is_null() {
    String letters = null;
    List<Map.Entry<String, Integer>> mostoccurs = mostOccurLetters.countTheMostOccurLetters(letters);
    // should return an empty list
    Assert.assertEquals(0, mostoccurs.size());
  }

  @Test
  public void should_return_empty_list_when_input_is_empty() {
    String letters = "";
    List<Map.Entry<String, Integer>> mostoccurs = mostOccurLetters.countTheMostOccurLetters(letters);
    // should return an empty list
    Assert.assertEquals(0, mostoccurs.size());
  }

  @Test
  public void should_return_the_most_occur_letters() {
    String letters = "abccddrrrrr";
    List<Map.Entry<String, Integer>> mostoccurs = mostOccurLetters.countTheMostOccurLetters(letters);
    // should contain exactly one item, "r" with a count of 5
    Assert.assertEquals(1, mostoccurs.size());
    Assert.assertEquals("r", mostoccurs.get(0).getKey());
    Assert.assertEquals(Integer.valueOf(5), mostoccurs.get(0).getValue());
  }

  @Test
  public void should_return_2items_when_exist_two_equal_occur_letters() {
    String letters = "cccbbaaa";
    List<Map.Entry<String, Integer>> mostoccurs = mostOccurLetters.countTheMostOccurLetters(letters);
    // should return two items, sorted first by count, then by the letter itself
    Assert.assertEquals(2, mostoccurs.size());
    Assert.assertEquals("a", mostoccurs.get(0).getKey());
    Assert.assertEquals(Integer.valueOf(3), mostoccurs.get(0).getValue());
    Assert.assertEquals("c", mostoccurs.get(1).getKey());
    Assert.assertEquals(Integer.valueOf(3), mostoccurs.get(1).getValue());
  }
}
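  1. The first unit test: if the string is null, the returned list is empty. Java code should avoid returning null where possible. When the return type is a List, allowing null creates two ambiguous states, "null" and "empty list", both of which can mean "nothing" or "does not exist", and that hurts readability. Java's Optional is another option to consider.
  2. The second unit test: if the string has length 0, a List of length 0 is returned. This is similar to the first test.
  3. The third unit test: when exactly one character has the highest count in the string, the returned List must have length 1, and its key and value are the most frequent character and its count.
  4. The fourth unit test: when two characters share the highest count, the returned List has length 2; the first element is the one that comes first in dictionary order, the second element is the next one, and so on.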
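
Item 1 mentions Optional as an alternative to returning null. Optional tends to fit methods that return at most one value, while an empty List already expresses "no results" for collections. As a rough sketch under that assumption, a variant that returns only a single character could look like the following; the class `OptionalSketch` and method `firstMostFrequent` are hypothetical names introduced here for illustration.

```java
import java.util.Optional;

// Hypothetical helper for illustration; not part of the original class.
class OptionalSketch {
  // Returns the most frequent character, the smallest one on ties,
  // or Optional.empty() when there is no input to inspect.
  static Optional<Character> firstMostFrequent(String s) {
    if (s == null || s.isEmpty()) {
      return Optional.empty();
    }
    // One counter per possible char value (0..65535).
    int[] counts = new int[Character.MAX_VALUE + 1];
    for (char c : s.toCharArray()) {
      counts[c]++;
    }
    char best = 0;
    for (int c = 0; c < counts.length; c++) {
      // Strict '>' keeps the smallest character when counts tie.
      if (counts[c] > counts[best]) {
        best = (char) c;
      }
    }
    return Optional.of(best);
  }
}
```

A caller can then write something like `OptionalSketch.firstMostFrequent(input).ifPresent(System.out::println)` without any null check.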