| <html> |
| <head> |
| <title>Debug Concurrency Proposal</title> |
| <meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1"> |
| </head> |
| |
| <body bgcolor="#FFFFFF" text="#000000"> |
| <h1>Concurrency in the Debug Platform</h1> |
| <p>Last revised: Nov 26, 2004. Author: Darin Wright.</p> |
| <h2>Problem</h2> |
| <p>Many debug architectures require communicating with a remote debug target. |
| The debug user interface needs to be robust in the face of latent and unreliable |
| connections. To avoid blocking the user interface, the debug platform is exploiting |
| the use of background jobs to perform operations and communicate with debuggers. |
| As the opportunity for concurrent requests against debuggers increases, the |
| platform must provide support for serializing conflicting requests against debuggers |
| - for example a read operation to render a stack frame and a request that would |
| invalidate the stack frame by resuming its thread. The debug platform needs |
| to provide API and a set of guidelines to follow when scheduling concurrent |
| operations and requests against debuggers.</p> |
| <h2>Proposal</h2> |
| <h2><b>The classic cuncurrency issues</b></h2> |
| <p>There are similarities between resources and debug elements and between resource |
| modification and and performing operations on debug elements. Both are hierarchical |
| in nature: a project contains folders and files, a debug target contains threads |
| and stack frames. To support concurrent operations on resources care must be |
| taken to properly lock the appropriate resources for updates. To properly support |
| concurrent operations on debug elements, a similar locking scheme should be |
| used. For example, in order to write to a file, a lock must be obtained on that |
| resource. In order to step in a thread, a lock should be obtained on the thread.</p> |
| <h2><b>Using the platform pattern</b></h2> |
| <p>The resource platform uses scheduling rules to achieve resource locking, via |
| a rule factory that generates rules for different types of operations. Similarly, |
| the debug platform could provide a rule factory to create scheduling rules for |
| operations on debug elements. There are two basic types of operations that can |
| be performed on a debug element: <b>access</b> operations and <b>modify</b> |
| operations.</p> |
| <ul> |
| <li>An access operation simply reads or accesses data from a debug element. |
| Generally, concurrent accesses may be supported by a debug architecture, unless |
| the architecture only allows for serialized requests to the remote target.</li> |
| <li>A modify operation changes the state of a debug element. Generally, only |
| one modify operations cannot be performed concurrently. However, the rules |
| may be different in different architectures. For example, the Java debugger |
| allows for step operations to be performed in different threads at the same |
| time, where as other architectures only allow stepping in one thread.</li> |
| </ul> |
| <h2><b>Specializing concurrency support</b></h2> |
| <p>Since different architectures may have different concurrency capabilities, |
| the rules for performing concurrent operations must be provided by the architecture. |
| The debug platform can provide a pessimistic rule factory as a default implementation |
| for targets that do not provide their own rule factory. Debuggers that want |
| to leverage extra concurrency must provide their own rule factory. The pessimistic |
| rule factory would provide rules for serialized modify and access operations.</p> |
| <ul> |
| <li>A rule factory can be provided using the platform's adapter mechanism. For |
| example, an <font face="Courier New, Courier, mono">IDebugRuleFactory</font> |
| adapter can be provided by an <font face="Courier New, Courier, mono">IDebugTarget</font></li> |
| </ul> |
| <h2><b>What does it mean to the debug platform?</b></h2> |
| <p>The debug platform's actions that modify and access debug elements should do |
| so using the appropriate scheduling rules. Note that the use of scheduling rules |
| does not mean that jobs have to be used. For example, one can simply use the |
| beginRule/endRule support on the job manager. An operation would consult a debug |
| target's rule factory to obtain the appropriate rule. For example, a step operation |
| on a stack frame would obtain the modify rule for the stack frame, and schedule |
| a job to perform the step.</p> |
| <h2><b>What does it mean to the debug implementor?</b></h2> |
| <p>To maintain backwards compatibility, debug implementations are not required |
| to do anything to work in the 3.1 debug platform. By default, a rule factory |
| will be provided that serializes access and modification operations. To enable |
| concurrent operations, a debugger can provide its own rule factory. For example |
| the Java debugger would provide a rule factory to allow concurrent access operations |
| within a debug target, and allow concurrent modification operations in different |
| threads.</p> |
| <p>Debug models may use scheduling rules within their implementations, so that |
| clients calling debug model APIs are enforced to use the proper rules. For example, |
| the step implementation would use beginRule/endRule calls to the job manager |
| when performing a step.</p> |
| <h2><b>Why do we need this?</b></h2> |
| <p>As we add perform more background operations in the debugger, and move communication |
| out of the UI thread to the background, we need a mechanism so support concurrent |
| operations on debuggers, and we have to let debuggers tell us what the concurrency |
| rules are for their architectures.</p> |
| <p>For example, in a step operation the following sequence of events occurs:</p> |
| <ul> |
| <li>The step action initiates a step request on a stack frame (or its owning |
| thread)</li> |
| <li>A RESUME/STEP_OVER event is fired</li> |
| <li>A timer is started in case the step takes longer than 500ms, in which case |
| the debug view is updated to show a long running step.</li> |
| <li>A SUSPEND/STEP_END event is fired</li> |
| <li>The timer is canceled.</li> |
| <li>The debug view refreshes, updating the thread label, and accessing the new |
| stack frames in the thread.</li> |
| <li>The variables view refreshes, accessing the variables in the stack frame |
| and children of any expanded variables. This may require evaluations to compute |
| logical structures. An evaluation may be performed to compute variable toString() |
| values in the details pane, or in line in the variables view, depending on |
| how the view is configured.</li> |
| <li>Source lookup is performed on the top stack frame, and an editor is opened. |
| Source lookup accesses the debug model to resolve a source name for a stack |
| frame. </li> |
| </ul> |
| <p>In the 3.1 debug platform the step request will be performed in a background |
| job. As well, the debug and variables view content providers will retrieve content |
| and generate labels in background jobs. Source lookup will also be performed |
| in a background job. All of this is done to avoid blocking the UI thread while |
| communicating with a debug target. As updating the variables view can trigger |
| evaluations (effectively resuming the thread that has just suspended to compute |
| toString()'s), this can interfere with rendering labels in the debug view, as |
| it invalidates stack frames. As well, the step action re-enables when the step |
| end event is received, allowing the user to initiate another step before the |
| views update. This can cause the views to show partial information, and information |
| in an error state as the thread resumes while labels are being generated. We |
| either want all views to update before we allow another step to occur (via serialization |
| of jobs), or allow the next step request to effectively cancel jobs that it |
| makes obsolete. We may be able to leverage the shouldSchedule/shouldRun support |
| in jobs to cancel jobs that are obsolete.</p> |
| <h2><b>Findings</b></h2> |
| <p>Last updated: Dec 3, 2004</p> |
| <p>Experimenting with a rule factory that clients can use to perform locking when |
| communicating with (accessing and modifying) a debugger revealed the following:</p> |
| <ul> |
| <li>The potential for deadlock increases. Debug model implementations may already |
| use Java's built-in synchronization to implement thread safety. The additional |
| use of scheduling rules as locks increases the chance of deadlocks in the |
| following scenario: a debug element obtains a VM lock on an object, and performs |
| a function that eventually requires a scheduling rule lock while another thread |
| holds the scheduling rule lock, and is waiting for the VM lock. This could |
| be avoided if the same locking mechanism were used by the client and implementation, |
| but that would require debug model implementations to change and publicize |
| their locking implementation.</li> |
| <li>Using the scheduling rule locks is expensive. Each access to a debug model |
| would require the use of an access lock. Thus, simple operations take on larger |
| overhead. For example, generating a label would require the following steps, |
| which will have a performance impact. |
| <ul> |
| <li>retrieving the debug rule factory adapter for a debug element</li> |
| <li>getting an access scheduling rule from the factory for the element</li> |
| <li>locking on the rule</li> |
| <li>computing the label</li> |
| <li>releasing the lock</li> |
| </ul> |
| </li> |
| <li>The client doesn't know when a lock is required. Requiring the use of locks |
| is aggressive, and will no doubt cause locking when no locks are needed. Debug |
| model implementations often cache information in the model to avoid communication |
| with the back end (for example element names, etc.). Thus, clients will perform |
| locking, when the implementation doesn't require it. Only the implementation |
| knows when a lock is required.</li> |
| <li>Client code becomes more complex. Although I was able to reduce the code |
| pattern to a simple try/finally block with a getLock(..) and releaseLock(..) |
| method, code interacting with a debug model grows more complex, and harder |
| to maintain.</li> |
| </ul> |
| <p>Based on these findings, we believe it is up to debug model implementations |
| to assure they are thread safe, performing their own synchronization as needed. |
| Models that require serialized communication with a back end are also required |
| to implement such behavior in their model. In terms of API, this is not a breaking |
| change. In terms of behavior, it is a change that may expose concurrency issues |
| in existing debug model implementations. Debug model implementations need to |
| ensure they are thread safe.</p> |
| <p> </p> |
| </body> |
| </html> |