首页 诗词 字典 板报 句子 名言 友答 励志 学校 网站地图
当前位置: 首页 > 教程频道 > 开发语言 > 编程 >

分布式系统中某个节点随机响应时间升高解析[String常量池招致dumpThread变慢]

2013-12-26 
分布式系统中某个节点随机响应时间升高解析[String常量池导致dumpThread变慢]一, 问题描述故事要从半年前

分布式系统中某个节点随机响应时间升高解析[String常量池导致dumpThread变慢]

一, 问题描述

故事要从半年前说起....

大体的问题描述是:

约一个季度以上的时间内, 我们发现我们系统的5台机器的90%响应时间都是每天都有随机的一台比其他机器高出20ms-30ms.

在 同事的努力下开始寻找这个问题, 主要怀疑是定时任务导致的. 经过不懈的努力, 一段时间的结果是发现 fastjson 进行反序列化的 json的key 做缓存, 放到static域里的一个List不释放. 大约这个List有65w大小. 然后就导致Log打印从以前的1ms 升到了20ms.

?

二, 问题分析和寻找

1, 从观察到上述的结果后,? 我们开始寻找具体导致这个问题的原因. 模型就是[65w的List String不释放] --> [Log.info一次20ms以上]

观察fastjson的源码发现, 这65w的string存储不释放可能导致的原因比较不好寻找, 不过在思考后, 直觉是由可能这两者都是在操作字符串, 会不会是字符串常量池导致的.

fastjson的缓存key代码如下:

?

/** * Constructs a new entry from the specified symbol information and next entry reference. */public Entry(char[] ch, int offset, int length, int hash, Entry next){    characters = new char[length];    System.arraycopy(ch, offset, characters, 0, length);    symbol = new String(characters).intern();    this.next = next;    this.hashCode = hash;    this.bytes = null;}public Entry(String text, int offset, int length, int hash, Entry next){    symbol = text.substring(offset, offset + length).intern();    characters = symbol.toCharArray();    this.next = next;    this.hashCode = hash;    this.bytes = null;}

?

上边这块代码是在fastjson使用自己实现的SymbolTable 缓存反序列化对象的key时 存储的数据结构和代码.?? 这里的key都会进行一次 intern 操作. 为了放入全局常量池中, 加快速度. 但这里我们调用的 商家系统 等价接口返回的key是变化的. 所以会导致缓存的key特别多.? 知道65w数据而不释放.

?

2, 看到了fastjson使用了intern操作, 敏感的意识到intern操作常量池后, 会导致Log打印日志时操作字符串变慢. 来看String.intern 的实现:

?

/** * Returns a canonical representation for the string object. * <p> * A pool of strings, initially empty, is maintained privately by the * class <code>String</code>. * <p> * When the intern method is invoked, if the pool already contains a * string equal to this <code>String</code> object as determined by * the {@link #equals(Object)} method, then the string from the pool is * returned. Otherwise, this <code>String</code> object is added to the * pool and a reference to this <code>String</code> object is returned. * <p> * It follows that for any two strings <code>s</code> and <code>t</code>, * <code>s.intern()&nbsp;==&nbsp;t.intern()</code> is <code>true</code> * if and only if <code>s.equals(t)</code> is <code>true</code>. * <p> * All literal strings and string-valued constant expressions are * interned. String literals are defined in section 3.10.5 of the * <cite>The Java&trade; Language Specification</cite>. * * @return  a string that has the same contents as this string, but is *          guaranteed to be from a pool of unique strings. */public native String intern();

??

好吧, 你们看到了.? intern 方法是个native的. 这一点以前没有跟过感觉有点麻烦了.

看方法的注释里如下解释:

?

When the intern method is invoked, if the pool already contains a * string equal to this <code>String</code> object as determined by * the {@link #equals(Object)} method, then the string from the pool is * returned. Otherwise, this <code>String</code> object is added to the * pool and a reference to this <code>String</code> object is returned.

?

注释中明确说明, 如果常量池中存在当前字符串, 就会直接返回当前字符串.? 如果常量池中没有此字符串, 会将此字符串放入常量池中后, 再返回.

从这里可以明确确定, String.intern的操作增大了常量池. 具体常量池是怎么运行的得继续寻找native代码.

?

3, 测试到底是不是intern操作导致的Log变慢:

?

private static final List<Object> finalStringList = new ArrayList<Object>();public void fillPermGenMemory() {    for (int i = 0; i < 1000000; i++) {        finalStringList.add(String.valueOf(i).intern());    }}public JsonResponse testLogIntern() {    long start = System.currentTimeMillis();    System.out.println("start=" + start);    logger.info("hao......");    logger.info("hao......");    logger.info("hao......");    logger.info("hao......");    long end = System.currentTimeMillis();    long c = end - start;    return new JsonResponse().addData(c);}

?

这样运行 fillPermGenMemory()方法后, 日志打印就到100ms了.?

若将 finalStringList.add(String.valueOf(i).intern()); 这句代码中的intern操作去掉, 变成 finalStringList.add(String.valueOf(i)); 后. 日志打印时间就变成1ms之内了.

因此通过上述的测试确定 Log变慢跟intern有十分强烈的联系. 接下来就要寻找具体的联系了.

?

4, 在网上找资料看String的常量池实现. 发现资料不是想象中的那么多, 而且1.6和1.7还有不小的区别

http://hellojava.info/?p=61

这篇文章中有提到 1.7 的jdk将String的常量池移到了非Perm区的队里, 轻易不会导致OOM了.

而且这里也比较明确的说道:

?

String.intern放进的String Pool是一个固定大小的Hashtable,默认值是1009,如果放进String Pool的String非常多,就会造成Hash冲突严重,从而导致链表会很长,而链表长了后直接会造成的影响就是当调用String.intern时性能会大幅下降(因为要一个一个找)。

?以及继续看他的native实现代码:

?\openjdk7\jdk\src\share\native\java\lang\String.c

?

Java_java_lang_String_intern(JNIEnv *env, jobject this){    return JVM_InternString(env, this);}

??

?\openjdk7\hotspot\src\share\vm\prims\jvm.h?

?

/** java.lang.String*/JNIEXPORT jstring JNICALLJVM_InternString(JNIEnv *env, jstring str); C:\cygwin\home\john\workspace\openjdk7\hotspot\src\share\vm\prims\jvm.cpp// String support ///////////////////////////////////////////////////////////////////////////JVM_ENTRY(jstring, JVM_InternString(JNIEnv *env, jstring str))  JVMWrapper("JVM_InternString");  JvmtiVMObjectAllocEventCollector oam;  if (str == NULL) return NULL;  oop string = JNIHandles::resolve_non_null(str);  oop result = StringTable::intern(string, CHECK_NULL);  return (jstring) JNIHandles::make_local(env, result);JVM_END 

?

\openjdk7\hotspot\src\share\vm\classfile\symbolTable.cpp

oop StringTable::intern(Handle string_or_null, jchar* name,                        int len, TRAPS) {  unsigned int hashValue = java_lang_String::hash_string(name, len);  int index = the_table()->hash_to_index(hashValue);  oop string = the_table()->lookup(index, name, len, hashValue);  // Found  if (string != NULL) return string;  // Otherwise, add to symbol to table  return the_table()->basic_add(index, string_or_null, name, len,                                hashValue, CHECK_NULL);} 

?

\openjdk7\hotspot\src\share\vm\classfile\symbolTable.cpp

?

oop StringTable::lookup(int index, jchar* name,                        int len, unsigned int hash) {  for (HashtableEntry<oop>* l = bucket(index); l != NULL; l = l->next()) {    if (l->hash() == hash) {      if (java_lang_String::equals(l->literal(), name, len)) {        return l->literal();      }    }  }  return NULL;}

?

大体的结构就是:

public JsonResponse testFormatIntern() { long start = System.currentTimeMillis(); System.out.println("start=" + start); System.out.println(MessageFormatter.format("zhe shi yige {} de rizi. {} duibudui ?", new Random().nextDouble(), new Random().nextDouble()).getMessage()); System.out.println(MessageFormatter.format("zhe shi yige {} de rizi2. {} duibudui ?", new Random().nextDouble(), new Random().nextDouble()).getMessage()); System.out.println(MessageFormatter.format("zhe shi yige {} de rizi3. {} duibudui ?", new Random().nextDouble(), new Random().nextDouble()).getMessage()); System.out.println(MessageFormatter.format("zhe shi yige {} de rizi4. {} duibudui ?", new Random().nextDouble(), new Random().nextDouble()).getMessage()); long end = System.currentTimeMillis(); long c = end - start; return new JsonResponse().addData(c);}

?

但是发现这个的时间在 intern 增大后几乎没有任何的影响, 还是1 2ms.. 这里就郁闷了, 发现走错路了.

?

6, 然后从这里开始使用housemd 工具对Logger.info() 方法进行时间的跟踪.

首先是 trace ?SLF4JLogger. 如下结果:

?

org/slf4j/impl/SLF4JLogger.debug(String)                                            sun.misc.Launcher$AppClassLoader@109aca82            2           <1ms    org.slf4j.impl.SLF4JLogger@6039fe60org/slf4j/impl/SLF4JLogger.debug(String, Object[])                                  sun.misc.Launcher$AppClassLoader@109aca82            2           <1ms    org.slf4j.impl.SLF4JLogger@7f6b81faorg/slf4j/impl/SLF4JLogger.getLevel(int)                                            sun.misc.Launcher$AppClassLoader@109aca82           14            1ms    org.slf4j.impl.SLF4JLogger@3ede1597org/slf4j/impl/SLF4JLogger.info(String)                                             sun.misc.Launcher$AppClassLoader@109aca82            2           41ms    org.slf4j.impl.SLF4JLogger@67c5fef9org/slf4j/impl/SLF4JLogger.isDebugEnabled()                                         sun.misc.Launcher$AppClassLoader@109aca82            1           <1ms    org.slf4j.impl.SLF4JLogger@665b8cbforg/slf4j/impl/SLF4JLogger.isTraceEnabled()                                         sun.misc.Launcher$AppClassLoader@109aca82            2           <1ms    org.slf4j.impl.SLF4JLogger@708cd580org/slf4j/impl/SLF4JLogger.log(Marker, String, int, String, Object[], Throwable)    sun.misc.Launcher$AppClassLoader@109aca82           14            9ms    org.slf4j.impl.SLF4JLogger@3ede1597

?

?org/slf4j/impl/SLF4JLogger.info(String) 这个方法时间最长.?

实现代码如下:

?

@Overridepublic void info(final String format) {    if (logger.isInfoEnabled()) {        logger.log(null, FQCN, Level.INFO, new SimpleMessage(format), null);    }}

?找到logger的实现类继续 trace AbstractLoggerWrapper:

?

org/apache/logging/log4j/spi/AbstractLoggerWrapper.isEnabled(Level, Marker, Object, Throwable)       sun.misc.Launcher$AppClassLoader@109aca82            1            1ms    org.apache.logging.log4j.spi.AbstractLoggerWrapper@56d31953org/apache/logging/log4j/spi/AbstractLoggerWrapper.isEnabled(Level, Marker, String)                  sun.misc.Launcher$AppClassLoader@109aca82            1           <1ms    org.apache.logging.log4j.spi.AbstractLoggerWrapper@d8c5450org/apache/logging/log4j/spi/AbstractLoggerWrapper.log(Marker, String, Level, Message, Throwable)    sun.misc.Launcher$AppClassLoader@109aca82            2           38ms    org.apache.logging.log4j.spi.AbstractLoggerWrapper@1ac28616

?继续trace AsyncLogger

?

org/apache/logging/log4j/core/async/AsyncLogger.actualAsyncLog(RingBufferLogEvent)                sun.misc.Launcher$AppClassLoader@109aca82            1            1ms    org.apache.logging.log4j.core.async.AsyncLogger@19de86bborg/apache/logging/log4j/core/async/AsyncLogger.location(String)                                  sun.misc.Launcher$AppClassLoader@109aca82            1           30ms    org.apache.logging.log4j.core.async.AsyncLogger@19de86bborg/apache/logging/log4j/core/async/AsyncLogger.log(Marker, String, Level, Message, Throwable)    sun.misc.Launcher$AppClassLoader@109aca82            1           61ms    org.apache.logging.log4j.core.async.AsyncLogger@19de86bb

?这里可以看一下?AsyncLogger.location 这个方法比较异常. 里边主要是调用了 return Log4jLogEvent.calcLocation(fqcnOfLogger); ??Log4jLogEvent.calcLocation()的代码如下:

?

public static StackTraceElement calcLocation(final String fqcnOfLogger) {    if (fqcnOfLogger == null) {        return null;    }    final StackTraceElement[] stackTrace = Thread.currentThread().getStackTrace();    boolean next = false;    for (final StackTraceElement element : stackTrace) {        final String className = element.getClassName();        if (next) {            if (fqcnOfLogger.equals(className)) {                continue;            }            return element;        }        if (fqcnOfLogger.equals(className)) {            next = true;        } else if (NOT_AVAIL.equals(className)) {            break;        }    }    return null;}

?

?从这一步做了一些测试最后得出是 ?Thread.currentThread().getStackTrace(); 这段代码耗费的时间最长. 因此下一步就要继续跟进这一段代码.

?

7, 这里才考虑到可能是?Thread.currentThread().getStackTrace(); 调用的时候, 因为这个方法主要是获取当前方法的调用栈, 特别是在打印异常的时候会用到. ?因此此方法会去常量池中取 java的类名, 方法名, 如果常量池比较答, 所以搜索时间会比较长. 因此根据这个思路去跟他的native的方法实现.

具体的过程如下:

?

public StackTraceElement[] getStackTrace() {    if (this != Thread.currentThread()) {        // check for getStackTrace permission        SecurityManager security = System.getSecurityManager();        if (security != null) {            security.checkPermission(                SecurityConstants.GET_STACK_TRACE_PERMISSION);        }        // optimization so we do not call into the vm for threads that        // have not yet started or have terminated        if (!isAlive()) {            return EMPTY_STACK_TRACE;        }        StackTraceElement[][] stackTraceArray = dumpThreads(new Thread[] {this});        StackTraceElement[] stackTrace = stackTraceArray[0];        // a thread that was alive during the previous isAlive call may have        // since terminated, therefore not having a stacktrace.        if (stackTrace == null) {            stackTrace = EMPTY_STACK_TRACE;        }        return stackTrace;    } else {        // Don't need JVM help for current thread        return (new Exception()).getStackTrace();    }}private native static StackTraceElement[][] dumpThreads(Thread[] threads); 

?

?

?\openjdk7\jdk\src\share\native\java\lang\Thread.c

?

static JNINativeMethod methods[] = {    {"start0",           "()V",        (void *)&JVM_StartThread},    {"stop0",            "(" OBJ ")V", (void *)&JVM_StopThread},    {"isAlive",          "()Z",        (void *)&JVM_IsThreadAlive},    {"suspend0",         "()V",        (void *)&JVM_SuspendThread},    {"resume0",          "()V",        (void *)&JVM_ResumeThread},    {"setPriority0",     "(I)V",       (void *)&JVM_SetThreadPriority},    {"yield",            "()V",        (void *)&JVM_Yield},    {"sleep",            "(J)V",       (void *)&JVM_Sleep},    {"currentThread",    "()" THD,     (void *)&JVM_CurrentThread},    {"countStackFrames", "()I",        (void *)&JVM_CountStackFrames},    {"interrupt0",       "()V",        (void *)&JVM_Interrupt},    {"isInterrupted",    "(Z)Z",       (void *)&JVM_IsInterrupted},    {"holdsLock",        "(" OBJ ")Z", (void *)&JVM_HoldsLock},    {"getThreads",        "()[" THD,   (void *)&JVM_GetAllThreads},    {"dumpThreads",      "([" THD ")[[" STE, (void *)&JVM_DumpThreads},}; 

??

\openjdk7\hotspot\src\share\vm\prims\jvm.h 294:

?

/* getStackTrace() and getAllStackTraces() method */JNIEXPORT jobjectArray JNICALLJVM_DumpThreads(JNIEnv *env, jclass threadClass, jobjectArray threads); C:\cygwin\home\john\workspace\openjdk7\hotspot\src\share\vm\prims\jvm.cpp4382-4414:// Support for java.lang.Thread.getStackTrace() and getAllStackTraces() methods// Return StackTraceElement[][], each element is the stack trace of a thread in// the corresponding entry in the given threads arrayJVM_ENTRY(jobjectArray, JVM_DumpThreads(JNIEnv *env, jclass threadClass, jobjectArray threads))  JVMWrapper("JVM_DumpThreads");  JvmtiVMObjectAllocEventCollector oam;  // Check if threads is null  if (threads == NULL) {    THROW_(vmSymbols::java_lang_NullPointerException(), 0);  }  objArrayOop a = objArrayOop(JNIHandles::resolve_non_null(threads));  objArrayHandle ah(THREAD, a);  int num_threads = ah->length();  // check if threads is non-empty array  if (num_threads == 0) {    THROW_(vmSymbols::java_lang_IllegalArgumentException(), 0);  }  // check if threads is not an array of objects of Thread class  klassOop k = objArrayKlass::cast(ah->klass())->element_klass();  if (k != SystemDictionary::Thread_klass()) {    THROW_(vmSymbols::java_lang_IllegalArgumentException(), 0);  }  ResourceMark rm(THREAD);  GrowableArray<instanceHandle>* thread_handle_array = new GrowableArray<instanceHandle>(num_threads);  for (int i = 0; i < num_threads; i++) {    oop thread_obj = ah->obj_at(i);    instanceHandle h(THREAD, (instanceOop) thread_obj);    thread_handle_array->append(h);  }  Handle stacktraces = ThreadService::dump_stack_traces(thread_handle_array, num_threads, CHECK_NULL);  return (jobjectArray)JNIHandles::make_local(env, stacktraces());JVM_END 

?

?

\openjdk7\hotspot\src\share\vm\services\threadService.cpp?

235-267:?

?

// Dump stack trace of threads specified in the given threads array.// Returns StackTraceElement[][] each element is the stack trace of a thread in// the corresponding entry in the given threads arrayHandle ThreadService::dump_stack_traces(GrowableArray<instanceHandle>* threads,                                        int num_threads,                                        TRAPS) {  assert(num_threads > 0, "just checking");  ThreadDumpResult dump_result;  VM_ThreadDump op(&dump_result,                   threads,                   num_threads,                   -1,    /* entire stack */                   false, /* with locked monitors */                   false  /* with locked synchronizers */);  VMThread::execute(&op);  // Allocate the resulting StackTraceElement[][] object  ResourceMark rm(THREAD);  klassOop k = SystemDictionary::resolve_or_fail(vmSymbols::java_lang_StackTraceElement_array(), true, CHECK_NH);  objArrayKlassHandle ik (THREAD, k);  objArrayOop r = oopFactory::new_objArray(ik(), num_threads, CHECK_NH);  objArrayHandle result_obj(THREAD, r);  int num_snapshots = dump_result.num_snapshots();  assert(num_snapshots == num_threads, "Must have num_threads thread snapshots");  int i = 0;  for (ThreadSnapshot* ts = dump_result.snapshots(); ts != NULL; i++, ts = ts->next()) {    ThreadStackTrace* stacktrace = ts->get_stack_trace();    if (stacktrace == NULL) {      // No stack trace      result_obj->obj_at_put(i, NULL);    } else {      // Construct an array of java/lang/StackTraceElement object      Handle backtrace_h = stacktrace->allocate_fill_stack_trace_element_array(CHECK_NH);      result_obj->obj_at_put(i, backtrace_h());    }  }  return result_obj;}

?

?

\openjdk7\hotspot\src\share\vm\services\threadService.cpp 566-577:

?

Handle ThreadStackTrace::allocate_fill_stack_trace_element_array(TRAPS) {  klassOop k = SystemDictionary::StackTraceElement_klass();  assert(k != NULL, "must be loaded in 1.4+");  instanceKlassHandle ik(THREAD, k);  // Allocate an array of java/lang/StackTraceElement object  objArrayOop ste = oopFactory::new_objArray(ik(), _depth, CHECK_NH);  objArrayHandle backtrace(THREAD, ste);  for (int j = 0; j < _depth; j++) {    StackFrameInfo* frame = _frames->at(j);    methodHandle mh(THREAD, frame->method());    oop element = java_lang_StackTraceElement::create(mh, frame->bci(), CHECK_NH);    backtrace->obj_at_put(j, element);  }  return backtrace;} 

?

?

?

\openjdk7\hotspot\src\share\vm\classfile\javaClasses.cpp 1635-[1651,1654,1658]:

?

oop java_lang_StackTraceElement::create(methodHandle method, int bci, TRAPS) {  // SystemDictionary::stackTraceElement_klass() will be null for pre-1.4 JDKs  assert(JDK_Version::is_gte_jdk14x_version(), "should only be called in >= 1.4");  // Allocate java.lang.StackTraceElement instance  klassOop k = SystemDictionary::StackTraceElement_klass();  assert(k != NULL, "must be loaded in 1.4+");  instanceKlassHandle ik (THREAD, k);  if (ik->should_be_initialized()) {    ik->initialize(CHECK_0);  }  Handle element = ik->allocate_instance_handle(CHECK_0);  // Fill in class name  ResourceMark rm(THREAD);  const char* str = instanceKlass::cast(method->method_holder())->external_name();  oop classname = StringTable::intern((char*) str, CHECK_0);  java_lang_StackTraceElement::set_declaringClass(element(), classname);  // Fill in method name  oop methodname = StringTable::intern(method->name(), CHECK_0);  java_lang_StackTraceElement::set_methodName(element(), methodname);  // Fill in source file name  Symbol* source = instanceKlass::cast(method->method_holder())->source_file_name();  oop filename = StringTable::intern(source, CHECK_0);  java_lang_StackTraceElement::set_fileName(element(), filename);  // File in source line number  int line_number;  if (method->is_native()) {    // Negative value different from -1 below, enabling Java code in    // class java.lang.StackTraceElement to distinguish "native" from    // "no LineNumberTable".    line_number = -2;  } else {    // Returns -1 if no LineNumberTable, and otherwise actual line number    line_number = method->line_number_from_bci(bci);  }  java_lang_StackTraceElement::set_lineNumber(element(), line_number);  return element();}

?

?

?经过漫长的jvm源码跟踪, 走过了几个岔路, 还是终于走上了正途, 最后的三句代码问题就是

?

?

oop classname = StringTable::intern((char*) str, CHECK_0);oop methodname = StringTable::intern(method->name(), CHECK_0);oop filename = StringTable::intern(source, CHECK_0);

?

?

这三段代码是获取类名, 方法名, 和 文件名. 去常量池中搜索的时候性能急剧下降.?

而且一般的StackTrace的长度在70左右, 因此一般一条log语句要调用 210次左右的intern操作. ?性能耗费也是很巨大

?

8, 好了, 基本过程结束, 就来做最后的验证工作:

?

public JsonResponse testGetStackTraceIntern() throws Exception{    StackTraceElement[] stackTraceElements = Thread.currentThread().getStackTrace();    long start = System.currentTimeMillis();    System.out.println("start=" + start);    for (int i = 0; i < 4; i++) {        for (StackTraceElement stackTraceElement : stackTraceElements) {            stackTraceElement.getClassName().intern();            stackTraceElement.getMethodName().intern();            if (stackTraceElement.getFileName() != null) stackTraceElement.getFileName().intern();        }    }    long end = System.currentTimeMillis();    long c = end - start;    return new JsonResponse().addData(c);}

??这里是模拟了StackTrace调用4次的intern操作结果, ?跟打印4次log调用的intern次数和字符串都是基本相同的, ?得到的结果是这个时间和打印log的时间相差在5%以内. 所以验证工作基本结束了. 这里才是真相!

?

三, 总结

前前后后, 我和同事寻找这个问题找了1 2 个月, 才把最后的结果给找了出来, 中途一度因为没有思路而没法跟下去, 但经过坚持最后还是得到了真相, 而且让我对JNI的调用, Java native方法的调用和实现都有了比较深入的认识和学习.

结果就是因为

1,每天我们执行一个统计任务. 统计任务调用商家系统的等价接口返回等价信息

2,fastjson反序列化等价信息的json, 将所有的key缓存到 String的常量池中不释放.

3,Logger.info的时候需要通过Thread.currentTread().getStackTrace() 去定位当前栈信息.

4,Thread.currentTread().getStackTrace() 方法生成StackTrace对象的时候, 会去String常量池中取 类的方法名 类名和文件名, ?因此时间都耗费在链表的查询上了

?

这个问题前后找了1 2个月, 在网上也没有任何跟这个例子有关的文章, 因为详细的记录下来, 避免大家以后碰到此问题比较棘手.

热点排行