Word frequency in a Java list of strings

I have a list of strings:

List<String> terms = ["Coding is great", "Search Engines are great", "Google is a nice search engine"]

How do I get the frequency of each word in the list? For example: {Coding:1, Search:2, Engines:1, engine:1, ....}

Here is my code:

    Map<String, Integer> wordFreqMap = new HashMap<>();
    for (String contextTerm : term.getContexTerms()) {
        String[] wordsArr = contextTerm.split(" ");
        for (String word : wordsArr) {
            Integer freq = wordFreqMap.get(word); // this line is getting reset every time I go to a new contextTerm
            freq = (freq == null) ? 1 : ++freq;
            wordFreqMap.put(word, freq);
        }
    }

3 answers

  1. # Answer 1

    // Requires java.util.Arrays, Collections, HashMap, List and Map.
    public static void main(String[] args) {
        String msg = "Coding is great search Engines are great Google is a nice search engine";
        // Split once and reuse the words for both the lookup list and the loop.
        List<String> words = Arrays.asList(msg.split(" "));
        Map<String, Integer> map = new HashMap<>();
        for (String word : words) {
            // Collections.frequency scans the whole list for each word.
            map.put(word, Collections.frequency(words, word));
        }
        System.out.println("values are " + map);
    }
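
    `Collections.frequency` rescans the whole list for every word, so the loop above is quadratic in the number of words. A minimal single-pass sketch, assuming Java 8 or later for `Map.merge`; the names `WordFrequencySketch` and `countWords` are made up for illustration:

    ```java
    import java.util.Arrays;
    import java.util.HashMap;
    import java.util.List;
    import java.util.Map;

    class WordFrequencySketch {

        // Counts each word across all terms in one pass (case-sensitive).
        static Map<String, Integer> countWords(List<String> terms) {
            Map<String, Integer> counts = new HashMap<>();
            for (String term : terms) {
                for (String word : term.split(" ")) {
                    // merge stores 1 for a new word, otherwise adds 1 to the stored count
                    counts.merge(word, 1, Integer::sum);
                }
            }
            return counts;
        }

        public static void main(String[] args) {
            List<String> terms = Arrays.asList(
                "Coding is great",
                "Search Engines are great",
                "Google is a nice search engine");
            System.out.println(countWords(terms));
        }
    }
    ```

    Keys are case-sensitive here, so `Search` and `search` are counted separately.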
    
  2. # Answer 2

    An idiomatic solution using Java 8 streams:

    import java.util.Arrays;
    import java.util.List;
    import java.util.Map;
    import java.util.stream.Collectors;
    
    public class SplitWordCount
    {
        public static void main(String[] args)
        {
            List<String> terms = Arrays.asList(
                "Coding is great",
                "Search Engines are great",
                "Google is a nice search engine");
    
            Map<String, Integer> result = terms.parallelStream().
                flatMap(s -> Arrays.asList(s.split(" ")).stream()).
                collect(Collectors.toConcurrentMap(
                    w -> w.toLowerCase(), w -> 1, Integer::sum));
            System.out.println(result);
        }
    }
    

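    Note that you may want to consider whether the case of the strings should matter. This version converts the strings to lower case and uses them as the keys of the final map. The result is: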

    {coding=1, a=1, search=2, are=1, engine=1, engines=1, 
         is=2, google=1, great=2, nice=1}
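
    To make the case question concrete, here is a small sketch that counts the same list twice, once with the words as written and once lower-cased. It swaps in `Collectors.groupingBy` with `Collectors.counting()` and a sequential stream instead of `toConcurrentMap`, purely as an alternative; `CaseCountSketch` and `count` are hypothetical names:

    ```java
    import java.util.Arrays;
    import java.util.List;
    import java.util.Map;
    import java.util.TreeMap;
    import java.util.function.Function;
    import java.util.stream.Collectors;

    class CaseCountSketch {

        // Counts words after applying keyFn to each one; the TreeMap keeps keys sorted.
        static Map<String, Long> count(List<String> terms, Function<String, String> keyFn) {
            return terms.stream()
                .flatMap(s -> Arrays.stream(s.split(" ")))
                .collect(Collectors.groupingBy(keyFn, TreeMap::new, Collectors.counting()));
        }

        public static void main(String[] args) {
            List<String> terms = Arrays.asList(
                "Coding is great",
                "Search Engines are great",
                "Google is a nice search engine");
            // Case-sensitive: "Search" and "search" are separate keys.
            System.out.println(count(terms, Function.identity()));
            // Case-insensitive: both are folded into "search".
            System.out.println(count(terms, String::toLowerCase));
        }
    }
    ```

    With the sample list, the case-sensitive count has 11 keys and the lower-cased one has 10, because `Search` and `search` collapse into one entry.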
    
  3. # Answer 3

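    The Java 8 answer is good, but it does not show you how to parallelize this in Java 7 (leaving aside the default implementation, which is the same as the stream's), so here is an example: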

      public static void main(final String[] args) throws InterruptedException {
    
        final ExecutorService service = Executors.newFixedThreadPool(10);
    
        final List<String> terms = Arrays.asList("Coding is great", "Search Engines are great",
            "Google is a nice search engine");
    
        final List<Callable<String[]>> callables = new ArrayList<>(terms.size());
        for (final String term : terms) {
          callables.add(new Callable<String[]>() {
    
            @Override
            public String[] call() throws Exception {
              System.out.println("splitting word: " + term);
              return term.split(" ");
            }
          });
        }
    
        final ConcurrentMap<String, AtomicInteger> counter = new ConcurrentHashMap<>();
        final List<Callable<Void>> callables2 = new ArrayList<>(terms.size());
        for (final Future<String[]> future : service.invokeAll(callables)) {
          callables2.add(new Callable<Void>() {
    
            @Override
            public Void call() throws Exception {
              System.out.println("counting word");
              // invokeAll implies that the future has finished its work
              for (String word : future.get()) {
                String lc = word.toLowerCase();
                // here it gets tricky: two threads might add the same word.
                AtomicInteger actual = counter.get(lc);
                if (null == actual) {
                  final AtomicInteger nv = new AtomicInteger();
                  actual = counter.putIfAbsent(lc, nv);
                  if (null == actual) {
                    actual = nv; // nv got added.
                  }
                }
                actual.incrementAndGet();
              }
              return null;
            }
          });
        }
        service.invokeAll(callables2);
        service.shutdown();
    
        System.out.println(counter);
    
      }
    

    And yes, Java 8 simplifies the work.

    And no: I tested it, but I don't know whether it is better than simple loops, nor whether it is fully thread-safe.

    (Looking at how you define your list, aren't you coding in Groovy? Groovy has support for parallelism.)
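
    On the thread-safety doubt: `putIfAbsent` is atomic on a `ConcurrentMap` and `incrementAndGet` is atomic on an `AtomicInteger`, so the counting idiom above should not lose updates. A rough way to check this is to hammer a single key from several threads. This is only a sketch; `ConcurrentCountCheck`, `increment` and `hammer` are illustrative names, and a passing run is evidence rather than proof:

    ```java
    import java.util.ArrayList;
    import java.util.List;
    import java.util.concurrent.Callable;
    import java.util.concurrent.ConcurrentHashMap;
    import java.util.concurrent.ConcurrentMap;
    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;
    import java.util.concurrent.atomic.AtomicInteger;

    class ConcurrentCountCheck {

        // Same get / putIfAbsent / incrementAndGet idiom as in the answer above.
        static void increment(ConcurrentMap<String, AtomicInteger> counter, String key) {
            AtomicInteger actual = counter.get(key);
            if (actual == null) {
                AtomicInteger fresh = new AtomicInteger();
                actual = counter.putIfAbsent(key, fresh);
                if (actual == null) {
                    actual = fresh; // our instance won the race
                }
            }
            actual.incrementAndGet();
        }

        // Runs `threads` tasks that each increment the same key `perThread` times.
        static int hammer(int threads, final int perThread) {
            final ConcurrentMap<String, AtomicInteger> counter = new ConcurrentHashMap<>();
            ExecutorService service = Executors.newFixedThreadPool(threads);
            List<Callable<Void>> tasks = new ArrayList<>();
            for (int i = 0; i < threads; i++) {
                tasks.add(new Callable<Void>() {
                    @Override
                    public Void call() {
                        for (int j = 0; j < perThread; j++) {
                            increment(counter, "great");
                        }
                        return null;
                    }
                });
            }
            try {
                service.invokeAll(tasks); // blocks until every task has completed
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while counting", e);
            } finally {
                service.shutdown();
            }
            return counter.get("great").get();
        }

        public static void main(String[] args) {
            System.out.println(hammer(8, 10_000));
        }
    }
    ```

    If no increments are lost, `hammer(8, 10_000)` returns 80000. On Java 8 the body of `increment` can be shortened to `counter.computeIfAbsent(key, k -> new AtomicInteger()).incrementAndGet()`.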