This document is very much a work in progress.
1. Intro
c-xrefactory is a software tool and a project aiming at restoring the sources of that old, but very useful, tool to a state where it can be enhanced and be the foundation for a highly capable refactoring browser.
It is currently in working condition, so you can use it. For information about that, see the README.md.
1.1. Caution
As indicated by the README.md, this is a long-term restoration project. So anything you find in this document might be old, incorrect, guesses, temporary holders for thoughts or theories. Or actually true and useful.
Especially names of variables, functions and modules are prone to change as understanding of them increases. They might also be refactored into something entirely different.
At this point this document is just a collection of unstructured thoughts, guesses, historic anecdotes and ideas. For example, the previously existing, and also unstructured, wiki pages have just been pasted in here in an attempt to collect everything in a single document. Perhaps that will make it possible to "refactor" the information into something actually useful.
The last part of this document is an Archive where completely obsolete descriptions have been moved for future software archeologists to find.
1.2. Background
You will find some background about the project in the README.md.
This document tries to collect the knowledge and understanding about how c-xrefactory actually works, along with plans for making it better in terms of working with the source, its structure and its features.
Hopefully over time this will be the design documentation of c-xrefactory, which, at that time, will be a fairly well structured and useful piece of software.
1.3. Goal
Ultimately c-xrefactory could become the refactoring browser for C, the one that everybody uses. As suggested by @tajmone in GitHub issue #39, by switching to a general protocol, we could possibly plug this in to many editors.
However, to do that we need to refactor out the protocol parts. And to do that we need a better structure, and to dare to change that, we need to understand more of the intricacies of this beast, and we need tests. So the normal legacy code catch 22 applies…
Test coverage is starting to look good, coming up to about 80% at the time of writing this. But that is mostly "application level" execution, rather than actual tests.
2. Context
c-xrefactory is designed to be an aid for programmers as they write, edit, inspect, read and improve the code they are working on.
The editor is used for usual manual manipulation of the source code. C-xrefactory interacts with the editor to provide navigation and automated edits, refactorings, through the editor.
3. Functional Overview
The c-xref program is actually a mish-mash of a multitude of features baked into one program. This is the major cause of the mess that it is source-wise.
It was
- a generator for persistent cross-reference data
- a reference server for editors, serving cross-reference, navigational and completion data over a protocol
- a refactoring server (the world's first to cross the Refactoring Rubicon)
- an HTML cross-reference generator (probably the root of the project) (REMOVED)
- a C macro generator for structure fill (and other) functions (REMOVED)
It is the first three that are unique and constitute the great value of this project. The last two have been removed from the source, the last one because it was a hack and prevented modern, tidy, building, coding and refactoring. The HTML cross-reference generator has been superseded by modern alternatives like Doxygen and is not at the core of the goal of this project.
One might surmise that it was the HTML cross-reference generator that was the initial purpose of what the original Xrefactory was based upon. Once that was in place the others followed, and were basically only bolted on top without much re-architecting of the C sources.
What we’d like to do is partition the project into separate parts, each having a clear usage.
The following sections are aimed at describing various features of c-xrefactory.
3.1. Options, option files and configuration
The current version of C-xrefactory allows only two possible sets of configuration/options.
The primary storage is (currently) the file $HOME/.c-xrefrc which stores the "standard" options for all projects. Each project has a separate section which is started with a section marker, the project name surrounded by square brackets, [project1].
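The section-marker convention can be illustrated with a small C sketch. This is not code from c-xrefactory: the function name parseSectionMarker and its signature are hypothetical, and the real option reader may treat whitespace or other details differently.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not from the c-xrefactory sources: returns true if
 * 'line' is a section marker of the form "[name]" and copies the project
 * name into 'name', which has room for 'size' bytes. */
bool parseSectionMarker(const char *line, char *name, size_t size) {
    assert(line != NULL && name != NULL);
    size_t length = strcspn(line, "\r\n");    /* ignore a trailing newline */
    if (length < 3 || line[0] != '[' || line[length - 1] != ']')
        return false;                         /* not "[", name, "]" */
    size_t nameLength = length - 2;
    if (nameLength >= size)
        return false;                         /* name would not fit */
    memcpy(name, line + 1, nameLength);
    name[nameLength] = '\0';
    return true;
}
```

Called with the line "[project1]" this sketch yields the name "project1", while an ordinary option line is rejected.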
When you start c-xref you can use the command line option -xrefrc to request that a particular option file should be used instead of the "standard options".
When running the edit server there seems to be no way to indicate different option files for different projects/files. Although you can start the server with -xrefrc, you will be stuck with that file for the whole session and for all projects.
3.2. LSP
The Language Server Protocol (LSP) is a common protocol for language servers such as clangd and c-xrefactory. It allows an editor (client) to interface to a server to request information, such as reference positions, and operations, such as refactorings, without knowing exactly which server it talks to.
Recent versions of c-xrefactory have an initial implementation of a very small portion of the LSP protocol. The plan is to fully integrate the functionality of c-xrefactory into the LSP protocol. This will allow use of c-xrefactory from not only Emacs but also Visual Studio Code or any other editor that supports the LSP protocol.
4. Quality Attributes
The most important quality attributes are
- correctness - a refactoring should never alter the behaviour of the refactored code
- completeness - no reference to a symbol should ever be missed
- performance - a refactoring should be sufficiently quick so the user keeps focus on the task at hand
5. Constraints
TBD.
6. Principles
6.1. Reference Database and Parsing
The reference database is used only to hold externally visible identifiers to ensure that references to an identifier can be found across all files in the used source.
All symbols that are only visible inside a unit are handled by reparsing the file of interest.
6.2. TBD.
7. Software Architecture
7.1. Container View
7.2. Containers
At this point the description of the internal structure of the containers is tentative. The actual interfaces are not particularly clean; most code files can and do include pretty much every other module.
One focus area for the ongoing work is to try to pry out modules/components from the code mess by moving functions around, renaming and hiding functions, where possible.
7.2.1. CxrefCore
cxrefCore is the core container. It does all the work when it comes to finding and reporting references to symbols, communicating refactoring requests as well as storing reference information for longer term storage and caching.
Although c-xref can be used as a command line tool, which can be handy when debugging or exploring, it is normally used in "server" mode. In server mode the communication between the editor extension and the cxrefCore container is a back-and-forth communication using a non-standard protocol over standard pipes.
The responsibilities of cxrefCore can largely be divided into
- parsing source files to create and maintain the references database which stores all inter-module references
- parsing source files to get important information such as the positions of a function’s beginning and end
- managing editor buffer state (as it might differ from the file on disc)
- performing symbol navigation
- creating and serving completion suggestions
- performing refactorings such as renames, extracts and parameter manipulation
At this point it seems like refactorings are performed as separate invocations of c-xref rather than through the server interface.
7.2.2. EditorExtension
The EditorExtension container is responsible for plugging into an editor of choice and handling the user interface, buffer management and execution of the refactoring edit operations.
Currently there is only one such extension supported, for Emacs, although there existed code, still available in the repo history, for an extension for jEdit which hasn’t been updated, modified or checked for a long time and no longer is a part of this project.
7.2.3. ReferencesDB
The References database stores crossreferencing information for symbols visible outside the module they are defined in. Information about local/static symbols is not stored but gathered by parsing that particular source file on demand.
Currently this information is stored in a somewhat cryptic, optimized text format.
This storage can be divided into multiple files, probably for faster access. Each symbol is then hashed to determine which of the "database" files it is stored in. As all crossreferencing information for a symbol is stored in the same "record", this allows reading only a single file when a symbol is looked up.
8. Code
8.1. Commands
The editorExtension calls the server using command line options. These are first converted to a command enum whose name starts with OLO ("on-line operation") or AVR ("available refactoring").
Sometimes the server needs to call the crossreferencer, which is done in the same manner, using command line options. But this call is internal, so the wanted arguments are stored in a vector which is passed to xref() in the same manner as main() passes the actual argc/argv.
Many of the commands require extra arguments like positions/markers which are passed in as extra arguments. E.g. a rename requires the name to rename to, which is sent in the -renameto= option, which is argparsed and stored in the option structure.
Some of these extra arguments are fairly random, like -olcxparnum= and -olcxparnum2=. This should be cleaned up.
A move towards "events" with arguments would be helpful. This would mean that we need to:
- List all "events" that c-xref needs to handle
- Define the parameters/extra info that each of them needs
- Clean up the command line options to follow this
- Create event structures to match each event and pass this to server, xref and refactory
- Rebuild the main command loop to parse command line options into event structures
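A minimal sketch of what such an event structure and its construction from options could look like is given here. The type and function names (EventKind, Event, eventFromOptions) are hypothetical. Only the option strings -olcxpush, -rfct-rename, -renameto= and -olcursor= are taken from this document, and the sketch handles just those.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical event kinds; a real list would cover every operation. */
typedef enum { EVENT_UNKNOWN, EVENT_PUSH, EVENT_RENAME } EventKind;

/* Hypothetical event structure carrying the arguments of one operation. */
typedef struct {
    EventKind kind;
    int cursorOffset;      /* from -olcursor=, or -1 if absent */
    const char *newName;   /* from -renameto=, only used by EVENT_RENAME */
} Event;

/* Translate an option vector into a single event structure. */
Event eventFromOptions(int argc, char *argv[]) {
    Event event = { EVENT_UNKNOWN, -1, NULL };
    assert(argv != NULL || argc == 0);
    for (int i = 0; i < argc; i++) {
        const char *arg = argv[i];
        if (strcmp(arg, "-olcxpush") == 0)
            event.kind = EVENT_PUSH;
        else if (strcmp(arg, "-rfct-rename") == 0)
            event.kind = EVENT_RENAME;
        else if (strncmp(arg, "-renameto=", 10) == 0)
            event.newName = arg + 10;          /* text after the '=' */
        else if (strncmp(arg, "-olcursor=", 10) == 0)
            event.cursorOffset = atoi(arg + 10);
    }
    return event;
}
```

The point of the design is that server, xref and refactory would then receive one explicit value instead of consulting the global option structure.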
8.2. Passes
There is a variable in main() called firstPassing which is set and passed down through mainEditServer() until it is reset in mainFileProcessingInitialisations() after initCaching().
This is probably related to the fact that c-xref allows for making multiple passes over the analyzed source in case you compile the project sources with different C defines. Variables in the c-xref sources indicate this, e.g. the loops in mainEditServerProcessFile() and mainXrefProcessInputFile() (which are both strangely limited by setting the maxPass variable to 1 before entering the loop…).
8.3. Parsers
C-xref uses a patched version of Berkeley yacc to generate parsers. There are a number of parsers
- C
- Yacc
- C expressions
There might also exist small traces of the Java parser, which was previously a part of the free c-xref, and the C++ parser that existed but was proprietary.
The patch to byacc is mainly to the skeleton and seems to relate mostly to handling of errors and adding a recursive parsing feature that is required for Java, which was supported previously. It is possible that this patch is no longer needed now that Java parsing has been removed, but this has not been tried.
Some changes are also made to be able to accommodate multiple parsers in the same executable, mostly solved by CPP macros renaming the parsing data structures so that they can be accessed using the standard names in the parsing skeleton. The Makefile generates the parsers and renames the generated files as appropriate.
8.4. Refactoring and the parsers
Some refactorings need more detailed information about the code, maybe all do?
One example, at least, is parameter manipulation. Then the refactorer calls the appropriate parser (serverEditParseBuffer()) which collects information in the corresponding semantic actions. This information is stored in various global variables, like parameterBeginPosition.
The parser is filling out a ParsedInfo structure which conveys information that can be used e.g. when extracting functions etc.
At this point I don’t understand exactly how this interaction is performed. There seems to be no way to parse only the appropriate parts, so the whole file needs to be re-parsed.
Findings:
- some global variables are set as a result of command line and arguments parsing, depending on which "command" the server is acting on
- the semantic rules in the parser(s) contain code that matches these global variables and then inserts special lexems in the lexem stream
One example is how a Java 'move static method' was performed. It requires a target position. That position is transferred from command line options to global variables. When the Java parser was parsing a class or similar it (or rather the lexer) looks at that "ddtarget position information" and inserts an OL_MARKER_TOKEN in the stream.
TODO: What extra "operation" the parsing should perform and return data for should be packaged into some type of "command" or parameter object that should be passed to the parser, rather than relying on global variables.
8.5. Reading Files
Here are some speculations about how the complex file reading is structured.
Each file is identified by a filenumber, which is an index into the file table, and seems to have a lexBuffer tied to it so that you can just continue from wherever you were. That in turn contains a CharacterBuffer that handles the actual character reading.
And there is also an "editorBuffer"…
The intricate interactions between these are hard to follow as the code here is littered with short character names which are copies of fields in the structures, and infested with many macros, probably in an ignorant attempt at optimizing. ("The root of all evil is premature optimization" and "Make it work, make it right, make it fast".)
It seems that everything starts in initInput() in yylex.c where the only existing call to fillFileDescriptor() is made. But you might wonder why this function does some initial reading; this should be pushed down to the buffers in the file descriptor.
8.5.1. Lexing/scanning
Lexing/scanning is performed in two layers, one in lexer.c which seems to be doing the actual lexing into lexems which are put in a lexembuffer. This contains a sequence of encoded and compressed symbols, each of which starts with a LexemCode followed by extra data, like Position. These seem to always be added, even though they are not always necessary.
The higher level "scanning" is performed, as per usual, by yylex.c. lexembuffer defines some functions to put and get lexems, chars (identifiers and file names?) as well as integers and positions.
At this point the put/get lexem functions take a pointer to a pointer to chars (which presumably is the lexem stream in the lexembuffer) which they also advance. This requires the caller to manage the LexemBuffer’s internal pointers outside and finally set them right when done.
It would be much better to call the "putLexem()"-functions with a lexemBuffer but there seem to be a few cases where the destination (often dd) is not a lexem stream inside a lexemBuffer. These might be related to macro handling or caching.
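The difference between the two calling styles can be sketched as follows. All names here (MiniLexemBuffer, putIntAt, bufferPutInt, bufferGetInt) are hypothetical and the encoding is simplified to raw int copies, so this only illustrates the interface shapes, not the real lexem encoding.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum { MINI_LEXEM_BUFFER_SIZE = 64 };

/* Hypothetical, simplified buffer that owns its read and write pointers. */
typedef struct {
    char chars[MINI_LEXEM_BUFFER_SIZE];
    char *write;   /* next position to write to */
    char *read;    /* next position to read from */
} MiniLexemBuffer;

/* Style 1: the caller owns the pointer and the function advances it.
 * This works for any destination, but the caller must keep track of
 * where the buffer's own pointers should end up. */
void putIntAt(char **writePointer, int value) {
    memcpy(*writePointer, &value, sizeof(value));
    *writePointer += sizeof(value);
}

void bufferInit(MiniLexemBuffer *buffer) {
    buffer->write = buffer->read = buffer->chars;
}

/* Style 2: the buffer manages its own pointers and can check bounds. */
void bufferPutInt(MiniLexemBuffer *buffer, int value) {
    size_t room = (size_t)(buffer->chars + MINI_LEXEM_BUFFER_SIZE - buffer->write);
    assert(room >= sizeof(value));
    putIntAt(&buffer->write, value);
}

int bufferGetInt(MiniLexemBuffer *buffer) {
    int value;
    assert((size_t)(buffer->write - buffer->read) >= sizeof(value));
    memcpy(&value, buffer->read, sizeof(value));
    buffer->read += sizeof(value);
    return value;
}
```

In the second style callers never touch the pointers directly, which is what makes the buffer's invariants checkable in one place.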
This is a work-in-progress. Currently most of the "normal" usages are prepared to use the LexemBuffer’s pointers. But the handling of macros and defines are cases where the lexems are not put in a LexemBuffer. See the TODO.org for current status of this Mikado sequence.
8.5.2. Semantic information
As the refactoring functions need some amount of semantic information, in the sense of information gathered during parsing, this information is collected in various ways when c-xref calls the "sub-task" to do the parsing required.
Two structures hold information about various things, among which are the memory index at certain points of the parsing. Thus it is possible to verify e.g. that an editor region does not cover a break in block or function structure. This structure is, at the point of writing, called parsedInfo and definitely needs to be tidied up.
8.6. Reference Database
When run in "xref" mode, c-xref creates, or updates, a database of references for all externally visible symbols it encounters.
A good design should have a clean and generic interface to the reference database, but chiselling this out is still a work in progress.
8.6.1. CXFILE
The current implementation of the reference database is file based, with an optimized storage format.
There is limited support to automatically keep these updated during an edit-compile cycle, so you might have to update manually now and then.
The project settings (or command line options) indicate where the file(s) are created and one option, -refnum, controls the number of files to be used.
This file (or files) contains compact, but textual, representations of the cross-reference information. The format is somewhat complex, but here are some things that I think I have found out:
- the encoding has single character markers which are listed at the top of cxfile.c
- the coding seems to often start with a number and then a character, such as '4l' (4 ell) meaning line 4 and '23c' meaning column 23
- references seem to be optimized to not repeat information if it would be a repetition, such as '15l3cr7cr' meaning that there are two references on line 15, one in column 3 and the other in column 7
- so there is a notion of "current" for all values which need not be repeated
- e.g. references all use 'fsulc' fields, i.e. file, symbol index, usage, line and column, but do not repeat a 'fsulc' as long as it is the same
- some "fields" have a length indicator before, such as filenames ('6:/abc.c') indicated by ':' and version information ('34v file format: C-xrefactory 1.6.0 ') indicated by 'v'.
So a line might say
12205f 1522108169p m1ia 84:/home/...
The line identifies the file with id 12205. The file was last included in an update of refs at some time identified by 1522108169 (mtime), has not been part of a full update of xrefs, and was mentioned on the command line. (I don’t know what the 'a' means…) Finally, the file name itself is 84 characters long.
TODO: Build a tool to decipher this so that tests can query the generated data for expected data. This is now partly ongoing in the 'utils' directory.
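To make the "number, then marker character" encoding and the notion of "current" values concrete, here is a small C sketch that decodes a fragment such as '15l3cr7cr'. It is based only on the guesses listed above: it assumes 'l' sets the current line, 'c' sets the current column and 'r' emits a reference. The names DecodedReference and decodeReferences are hypothetical, and the sketch ignores all other markers, so it would misread length-prefixed fields such as file names.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

typedef struct {
    int line;
    int column;
} DecodedReference;

/* Decode references from 'text', storing at most 'max' of them in 'out'.
 * Returns the number of references found, which may exceed 'max'. */
size_t decodeReferences(const char *text, DecodedReference *out, size_t max) {
    int line = 0, column = 0;   /* the "current" values, kept between records */
    int number = 0;             /* the number preceding the next marker */
    size_t count = 0;
    assert(text != NULL && (out != NULL || max == 0));
    for (const char *p = text; *p != '\0'; p++) {
        if (isdigit((unsigned char)*p)) {
            number = number * 10 + (*p - '0');
            continue;
        }
        switch (*p) {
        case 'l': line = number; break;
        case 'c': column = number; break;
        case 'r':               /* a reference at the current line/column */
            if (count < max) {
                out[count].line = line;
                out[count].column = column;
            }
            count++;
            break;
        default:                /* other markers are ignored in this sketch */
            break;
        }
        number = 0;
    }
    return count;
}
```

Decoding '15l3cr7cr' with this sketch gives two references, both on line 15, in columns 3 and 7. The second reference never mentions the line, because the current line carries over.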
8.6.2. Reference Database Reading
All information about an externally visible symbol is stored in one, and only one, reference file, determined by hashing the linkname of the symbol. So it will always suffice to read one reference file when consulting the reference database (in the form of CXFILE) for a symbol.
The reading of the CXFILE format is controlled by `scanFunctionTable`s. Each of these consists of a list of entries, one for each key/tag/recordCode (see format description above) that the scan will process.
As the reference file reading encounters a key/tag/recordCode it will consult the table and see if there is an entry pointing to a handler function for that key/tag/recordCode. If so, it will be called.
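The table-driven dispatch described here can be sketched in C as follows. The types and names (RecordHandler, ScanTableEntry, dispatchRecord, rememberLine, exampleTable) are hypothetical; the real scanFunctionTable entries and handler signatures are likely to carry more information.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical handler type: gets the number that preceded the record
 * code plus a caller-supplied context. */
typedef void (*RecordHandler)(int value, void *context);

typedef struct {
    char recordCode;        /* the single character key/tag */
    RecordHandler handler;  /* what to do when that key is seen */
} ScanTableEntry;

/* Look up 'recordCode' in the table and call its handler, if any.
 * Returns false when the scan has no interest in this record code. */
bool dispatchRecord(const ScanTableEntry *table, size_t entryCount,
                    char recordCode, int value, void *context) {
    assert(table != NULL || entryCount == 0);
    for (size_t i = 0; i < entryCount; i++) {
        if (table[i].recordCode == recordCode) {
            table[i].handler(value, context);
            return true;
        }
    }
    return false;
}

/* Example handler: remembers the most recent line number seen. */
static void rememberLine(int value, void *context) {
    *(int *)context = value;
}

/* Example table for a scan that only cares about line records. */
static const ScanTableEntry exampleTable[] = {
    { 'l', rememberLine },
};
```

Different scans can then share the same reading loop and differ only in which table they pass in.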
8.7. Editor Plugin
The editor plugin has three different responsibilities:
- serve as the UI for the user when interacting with certain c-xref related functions
- query the c-xref server for symbol references and support navigating these in the source
- initiate source code operations ("refactorings") and execute the resulting edits
Basically Emacs (and probably other editors) starts c-xref in "server-mode" using -server which connects the editor with c-xref through stdout/stdin. If you have (setq c-xref-debug-mode t) this command is logged in the *Messages* buffer with the prefix "calling:".
Commands are sent from the editor to the server on its standard input. They look very much like normal command line options, and in fact c-xref will parse that input in the same way using the same code. When the editor sends an end-of-options line, the server will start executing whatever was sent, and return some information in the file given as an -o option when the editor starts the c-xref server process. The file is named and created by the editor and usually resides in /tmp. With c-xref-debug-mode set to on this is logged as "sending:". If you (setq c-xref-debug-preserve-tmp-files t) Emacs will also not delete the temporary files it creates so that you can inspect them afterwards.
When the server has finished processing the command and placed the output in the output file it sends a <sync> reply.
The editor can then pick up the result from the output file and do what it needs to do with it ("dispatching:").
8.7.1. Invocations
The editor invokes a new c-xref process for the following cases:
- Refactoring
  Each refactoring operation calls a new instance of c-xref?
- Create Project
  When a c-xref function is executed in the editor and there is no project covering that file, an interactive "create project" session is started, which is run by a separate c-xref process.
8.7.2. Buffers
There is some magical editor buffer management happening inside of c-xref which is not clear to me at this point. Basically it looks like the editor-side tries to keep the server in sync with which buffers are opened with what file…
At this point I suspect that -preload <file1> <file2> means that the editor has saved a copy of <file1> in <file2> and requests the server to set up a "buffer" describing that file and use it instead of the <file1> that resides on disk.
This is essential when doing refactoring since the current version of the file most likely only exists in the editor, so the editor has to tell the server the current content somehow. This is what the -preload option does.
8.8. Editor Server
When serving an editor the c-xrefactory application is divided into the server, c-xref, and the editor part. At this point only Emacsen are supported, so that part is implemented in the editor/Emacs packages.
8.8.1. Interaction
The initial invocation of the edit server creates a process with which communication is over stdin/stdout using a protocol which from the editor is basically a version of the command line options.
When the editor has delivered all information to the server it sends 'end-of-options' as a command and the edit server processes whatever it has and responds with <sync> which means that the editor can fetch the result in the file it named as the output file using the '-o' option.
As long as the communication between the editor and the server is open, the same output file will be used. This makes it hard to catch some interactions, since an editor operation might result in multiple interactions, and the output file is then re-used.
Setting the emacs variable c-xref-debug-mode forces the editor to copy the content of such an output file to a separate temporary file before re-using it.
For some interactions the editor starts a completely new and fresh c-xref process, see below. And actually you can’t do refactorings using the server, they have to be separate calls. (Yes?) I have yet to discover why this design choice was made.
There are many things in the sources that handle refactorings separately, such as refactoring_options, which is a separate copy of the options structure used only when refactoring.
8.8.2. Protocol
Communication between the editor and the server is performed using text through standard input/output to/from c-xref. The protocol is defined in src/protocol.tc and must match editor/emacs/c-xrefprotocol.el.
The definition of the protocol only caters for the server→editor part, the editor→server part consists of command lines resembling the command line options and arguments, and actually is handled by the same code.
The file protocol.tc is included in protocol.h and protocol.c, which generate declarations and definitions for the elements using some macros.
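The general technique of expanding one item list several times with different macro definitions (often called X-macros) can be sketched as follows. This is not the content of protocol.tc: the item names, their texts and the two-argument form of PROTOCOL_ITEM are assumptions made for illustration, and the list is written as a macro here only so that the example fits in one file instead of using an #include.

```c
/* Stand-in for the contents of an item file such as protocol.tc.
 * Item names and texts are made up for this example. */
#define EXAMPLE_PROTOCOL_ITEMS                          \
    PROTOCOL_ITEM(EXAMPLE_GOTO, "goto")                 \
    PROTOCOL_ITEM(EXAMPLE_PRECHECK, "precheck")         \
    PROTOCOL_ITEM(EXAMPLE_REPLACEMENT, "replacement")

/* What a header could do: turn every item into a declaration. */
#define PROTOCOL_ITEM(name, text) extern const char name[];
EXAMPLE_PROTOCOL_ITEMS
#undef PROTOCOL_ITEM

/* What the matching source file could do: turn every item into a
 * definition, so declarations and definitions cannot drift apart. */
#define PROTOCOL_ITEM(name, text) const char name[] = text;
EXAMPLE_PROTOCOL_ITEMS
#undef PROTOCOL_ITEM
```

Adding an item to the list then updates both expansions at once. Wrapping the same items into a third form for another language follows the same pattern.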
There is a similar structure with c-xrefprotocol.elt which includes protocol.tc to wrap the PROTOCOL_ITEMs into defvars.
There is also some Makefile trickery that ensures that the C and elisp implementations are in sync.
8.8.3. Invocation of server
The editor fires up a server and keeps talking over the established channel (elisp function 'c-xref-start-server-process'). This probably puts extra demands on the memory management in the server, since it might need to handle multiple information sets and options (as read from a .c-xrefrc file) for multiple projects simultaneously over a longer period of time. (E.g. if the user enters the editor starting with one project and then continues to work on another then new project options need to be read, and new reference information be generated, read and cached.)
TODO: Figure out and describe how this works by looking at the elisp-sources.
FINDINGS:
- c-xref-start-server-process in c-xref.el
- c-xref-send-data-to-running-process in c-xref.el
- c-xref-server-call-refactoring-task in c-xref.el
8.8.4. Communication Protocol
The editor server is started using the appropriate command line option and then it keeps the communication over stdin/stdout open.
The editor part sends command line options to the server, which look something like (from the read_xrefs test case):
-encoding=european -olcxpush -urldirect "-preload" "<file>" "-olmark=0" "-olcursor=6" "<file>" -xrefrc ".c-xrefrc" -p "<project>" end-of-options
In this case the "-olcxpush" is the operative command which results in the following output
<goto> <position-lc line=1 col=4 len=66>CURDIR/single_int1.c</position-lc> </goto>
As we can see from this interaction, the server will handle (all?) input as a command line and manage the options as if it was a command line invocation.
This explains the intricate interactions between the main program and the option handling.
The reason behind this might be that a user of the editor might be editing files on multiple projects at once, so every interrogation/operation needs to clearly set the context of that operation, which is what a user would do with the command line options.
8.8.5. OLCX Naming
It seems that all on-line editing server functions have an olcx prefix, "On-Line C-Xrefactory", maybe…
8.9. Refactoring
This is, of course, the core reason why I want to restore this: to get at its refactoring capabilities. So far, much is not understood, but here are some bits and pieces.
8.9.1. Editor interface
One thing that really confused me in the beginning was that the editor, primarily Emacs, doesn’t use the actual server that it has started for refactoring operations (and perhaps for other things also?). Instead it creates a separate instance which it talks to about one refactoring.
I’ve just managed to create the first automatic test for refactorings, olcx_refactory_rename. It was created by running the sandboxed emacs to record the communication and thus finding the commands to use.
Based on this learning it seems that a refactoring typically is a single invocation of c-xref with appropriate arguments (start & stop markers, the operation, and so on) and the server then answers with a sequence of operations, like
<goto>
<position-off off=3 len=<n>>CURDIR/test_source/single_int1.c</position-off>
</goto>
<precheck len=<n>> single_int_on_line_1_col_4;</precheck>
<replacement>
<str len=<n>>single_int_on_line_1_col_4</str> <str len=<n>>single_int_on_line_1_col_44</str>
</replacement>
8.9.2. Interactions
I haven’t investigated the internal flow of such a sequence, but it is starting to look like c-xref is internally re-reading the initialization. I’m not at this point sure what this means. I hope it’s not internal recursion…
8.9.3. Extraction
Each type of refactoring has its own little "language". E.g. extracting a method/function using -refactory -rfct-extract-function will return something like
<extraction-dialog type=newFunction_> <str len=20> newFunction_(str);
</str>
<str len=39>static void newFunction_(char str[]) {
</str>
<str len=3>}
</str>
<int val=2 len=0></int>
</extraction-dialog>
So there is much logic in the editor for this. I suspect that the three <str> parts are
- what to replace the current region with
- what to place before the current region
- what to place after the current region
If this is correct then all extractions copy the region verbatim and then the server only has to figure out how to "glue" that to a semantically correct call/argument list.
As a side note the editor asks for a new name for the function and then calls the edit server with a rename request (having preloaded the new source file(s) of course).
8.9.4. Protocol
Deciphering the interaction between an editor and the edit server in c-xrefactory isn’t easy. The protocol isn’t very clear or concise. Here I’m starting to collect the important bits of the invocation, the required and relevant options and the returned information.
The test cases for various refactoring operations should give you some more details.
All of these require a -p (project) option to know which c-xref project options to read.
General Principles
Refactorings are done using a separate invocation; the edit server mode cannot handle refactorings. At least that is how the Emacs client does it (I haven’t looked at the jEdit version).
I suspect that it once was a single server that did both the symbol management and the refactoring as there are remnants of a separate instance of the option structure named "refactoringOptions". Also the check for the refactoring mode is done using options.refactoringRegime == RegimeRefactory which seems strange.
Anyway, if the refactoring succeeds the suggested edits are, as per usual, in the communications buffer.
However, there are a couple of cases where the communication does not end there. Possibly because the client needs to communicate some information back before the refactoring server can finish the job, like presenting some menu selection.
My guess at this point is that it is the refactoring server that closes the connection when it is done…
Rename
Invocation: -rfct-rename -renameto=NEW_NAME -olcursor=POSITION FILE
Semantics: The symbol under the cursor (at POSITION in FILE) should be renamed (replaced at all occurrences) by NEW_NAME.
Result: sequence of
<goto>
<position-off off=POSITION len=N>FILE</position-off>
</goto>
<precheck len=N>STRING</precheck>
followed by sequence of
<goto>
<position-off off=POSITION len=N>FILE</position-off>
</goto>
<replacement>
<str len=N>ORIGINAL</str> <str len=N>REPLACEMENT</str>
</replacement>
Protocol Messages
- <goto>{position-off}</goto> → editor
  Requests the editor to move the cursor to the indicated position (file, position).
- <precheck len={int}>{string}</precheck> → editor
  Requests that the editor verifies that the text under the cursor matches the string.
- <replacement>{str}{str}</replacement>
  Requests that the editor replaces the string under the cursor, which should be 'string1', with 'string2'.
- <position-off off={int} len={int}>{absolute path to file}</position-off>
  Indicates a position in the given file. 'off' is the character position in the file.
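As an illustration of how a client could pick apart one of these messages, here is a minimal C sketch that extracts the fields of a position-off element using sscanf(). The PositionOff structure and the parsePositionOff function are hypothetical, the path buffer size is arbitrary, and the sketch assumes the attributes always appear in the order shown above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical holder for the fields of one <position-off> element. */
typedef struct {
    int offset;        /* the 'off' attribute */
    int length;        /* the 'len' attribute */
    char path[256];    /* the element content, i.e. the file path */
} PositionOff;

/* Returns 1 if 'message' starts with a well-formed <position-off>
 * element and all three fields could be read, otherwise 0. */
int parsePositionOff(const char *message, PositionOff *position) {
    assert(message != NULL && position != NULL);
    int fields = sscanf(message, "<position-off off=%d len=%d>%255[^<]",
                        &position->offset, &position->length, position->path);
    return fields == 3;
}
```

A robust client would of course use a proper tokenizer, since this sketch rejects any variation in attribute order.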
8.9.5. Memory handling
Memory may be dynamically allocated using malloc(), in which case it must be managed like all malloc()ed memory to avoid memory leaks or pointer problems. This is used mostly for local and temporary areas.
But memory can also be locally managed using the structure Memory and related functions.
The Memory type
Memory allocation using the Memory
type allows managing memory
locally and separately depending it its use. E.g. the primary memory
is cxMemory
where all collected references are stored including
reference tables and other management areas. This type of memory can
easily be discarded, e.g. when a file is completely analyzed or a
refactoring is complete.
Separate memory areas are managed through ppmMemory, fileTableMemory, macroArgumentsMemory and macroBodyMemory, and possibly others. This list indicates that this type of memory is used because the amount of source to be analysed may be so large that it does not all fit at the same time and needs to be cached/discarded and restored as needed.
The Memory
type allows both re-initializing with a different size
and the optional choice to be notified when overflow happens using an
overflowHandler
function.
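As a rough illustration of the idea, and not of the actual implementation, a locally managed memory area with an optional overflow handler could look like the sketch below. All names (`Arena`, `arenaInit()`, `arenaAlloc()`, `arenaReset()`, `growHandler()`) are hypothetical, and alignment is ignored to keep it short.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical sketch, NOT the actual Memory type in c-xrefactory. */
typedef struct Arena Arena;
typedef bool (*OverflowHandler)(Arena *arena, size_t needed);

struct Arena {
    char *area;               /* start of the managed area */
    size_t size;              /* total size in bytes */
    size_t index;             /* first free byte */
    OverflowHandler overflow; /* optional, may be NULL */
};

/* (Re-)initialize with a new size, discarding any previous content.
   The Arena must be zero-initialized before the first call. */
bool arenaInit(Arena *arena, size_t size, OverflowHandler handler) {
    free(arena->area); /* free(NULL) is a no-op */
    arena->area = malloc(size);
    arena->size = arena->area != NULL ? size : 0;
    arena->index = 0;
    arena->overflow = handler;
    return arena->area != NULL;
}

/* Allocate by bumping the index; on overflow the handler gets one
   chance to make room before the allocation fails. */
void *arenaAlloc(Arena *arena, size_t bytes) {
    if (bytes > arena->size - arena->index) {
        if (arena->overflow == NULL || !arena->overflow(arena, bytes))
            return NULL;
        if (bytes > arena->size - arena->index)
            return NULL;
    }
    void *allocated = arena->area + arena->index;
    arena->index += bytes;
    return allocated;
}

/* Discard everything at once, e.g. when a file is completely analyzed. */
void arenaReset(Arena *arena) {
    arena->index = 0;
}

/* Example handler: grow the area. Note that pointers handed out earlier
   may dangle after realloc(), so a real handler has to be more careful. */
bool growHandler(Arena *arena, size_t needed) {
    size_t newSize = (arena->index + needed) * 2;
    char *grown = realloc(arena->area, newSize);
    if (grown == NULL)
        return false;
    arena->area = grown;
    arena->size = newSize;
    return true;
}
```

The point of the design is that discarding is a single index reset, which is what makes it cheap to throw away everything collected for a file or a refactoring.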
8.9.6. Option Memory
The memory handling for options deserves special explanation and attention.
When defining options, from the command line, options file or piped
from an editor process, the strings need to be preserved and
stored. This is done by "dynamically" allocating such areas in the
"options memory", optMemory
.
But since this is an integral part of the options structure, whenever an Options structure is copied, special care has to be taken so that the fields in the target structure point into the memory area of the target structure and not, as they did in the original structure, into the memory of the source structure.
There are functions that, through tricky memory arithmetic, adjust all
pointers to point correctly. To this end, all memory locations in an
Options
structure are collected in a linked list which can be
traversed.
The nodes in the linked list are also allocated in the "dynamic" memory of the Options structure.
8.10. Configuration
8.10.1. Options
There are three possible sources for options.
-
Configuration files (~/.c-xrefrc)
-
Piped options sent to edit server
-
Command line options
Not all options are relevant in all cases.
All option sources use exactly the same format so that the same code for decoding them can be used.
8.10.2. Logic
When the editor has a file open it needs to "belong" to a project. The logic for finding which project is intricate and complicated.
In this code there are also checks for things like whether the file is already in the index and whether the configuration file has changed since last time, indicating that there are scenarios that are more complicated (the server, obviously).
But I also think this code should be simplified a lot.
9. Modules
The current state of c-xrefactory
is not such that clean modules can
easily be identified and located. This is obviously one important goal
of the continuing refactoring work.
To be able to do that we need to understand the functionality enough so that clusters of code can be refactored to be more and more clear in terms of responsibilities and interfaces.
This section makes a stab at identifying some candidates for modules, as illustrated by the component diagram for cxrefCore.
9.1. Yylex
9.1.1. Responsibilities
-
Transform source text to sequences of lexems and additional information
-
Register and apply C pre-processor macros and defines as well as defines made as command line options (-D)
-
Handle include files by pushing and popping read contexts
9.1.2. Interface
The yylex
module has the standard interface required by any
yacc
-based parser, which is a simple yylex(void)
function.
9.2. Parser
9.3. Xref
9.4. Server
9.5. Refactory
9.6. Cxref
9.7. Main
9.8. Cxfile
9.8.1. Responsibilities
Read and write the CXref database in "plain" text format.
9.8.2. File format
The current file format for the cross-reference data consists of records with the general format
<number><key>[<value>]
There are two important types of lines, a file information line and a symbol information line.
The actual keys are documented in cxfile.c
, but here is an example
file information line:
32571f 1715027668m 21:/usr/include/ctype.h
First we have two simple value/key pairs. We see "32571f" indicating that this is file information for file with file number 32571.
Secondly we have "1715027668m". This is the modification time of the file which is stored to be able to see if that file has been updated since the reference database was last written.
And the third part is "21:/usr/include/ctype.h", which is of a record type that is a bit more complex. The number is the length of the value. The ':' indicates that the record is a filename.
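A minimal sketch of reading such a line could look as follows. It only handles the three keys in the example ('f', 'm' and ':'), and the names `FileInfoLine` and `parseFileInfoLine()` are invented for this illustration. Note that the length in the example (21) is one more than the 20 characters of the path shown. I have not checked whether that counts a terminator or separator, so the sketch copies at most that many characters and stops at the end of the line.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical holder for the fields of a file information line. */
typedef struct {
    long long fileNumber;       /* from the 'f' record */
    long long modificationTime; /* from the 'm' record */
    char fileName[256];         /* from the ':' record */
} FileInfoLine;

/* Parse records of the form <number><key>[<value>], separated by spaces.
   Returns false on an unknown key or malformed record. */
bool parseFileInfoLine(const char *line, FileInfoLine *info) {
    const char *p = line;
    memset(info, 0, sizeof(*info));
    while (*p != '\0') {
        if (*p == ' ') {
            p++;
            continue;
        }
        if (!isdigit((unsigned char)*p))
            return false;
        char *end;
        long long number = strtoll(p, &end, 10);
        switch (*end) {
        case 'f':
            info->fileNumber = number;
            p = end + 1;
            break;
        case 'm':
            info->modificationTime = number;
            p = end + 1;
            break;
        case ':': {
            /* Here the number is the length of the value that follows */
            size_t available = strlen(end + 1);
            size_t length = (size_t)number < available ? (size_t)number : available;
            if (length >= sizeof(info->fileName))
                return false;
            memcpy(info->fileName, end + 1, length);
            info->fileName[length] = '\0';
            p = end + 1 + length;
            break;
        }
        default:
            return false;
        }
    }
    return true;
}
```

The real reader in cxfile.c knows many more keys. The interesting property shown here is that a length-prefixed value can contain spaces, since the reader never has to search for a delimiter.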
9.9. Caching
9.10. c-xref.el
9.11. c-xrefactory.el
10. Data Structures
There are a lot of different data structures used in c-xrefactory
.
This is a first step towards visualising some of them.
10.1. ReferenceItems and References
This is a fundamental pair. A ReferenceItem
is the symbol that
occurs at the locations that References
indicate.
Each ReferenceItem has a linked list of References, each denoting one occurrence of that "symbol".
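In very simplified form the relation could be pictured like this. The field names, the `Position` layout and the helper functions are assumptions made for the sketch and the real structures carry much more information.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins, field names are assumptions for this sketch. */
typedef struct {
    int file;
    int line;
    int col;
} Position;

typedef struct Reference {
    Position position;      /* where the symbol occurs */
    struct Reference *next; /* next occurrence of the same symbol */
} Reference;

typedef struct {
    const char *name;      /* the "symbol" */
    Reference *references; /* head of the list of occurrences */
} ReferenceItem;

/* Prepend an occurrence to the item's list (the caller owns the node). */
void addReference(ReferenceItem *item, Reference *reference) {
    reference->next = item->references;
    item->references = reference;
}

/* Walk the list to count the occurrences. */
int countReferences(const ReferenceItem *item) {
    int count = 0;
    for (const Reference *r = item->references; r != NULL; r = r->next)
        count++;
    return count;
}
```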
10.2. Symbols and References
There is also a structure called Symbol
. But, why is there no
connection between the symbols and the references?!?
So what are these actually?
10.3. Files and Buffers
Many strange things are going on with reading files so that is not completely understood yet.
(There should probably be a section on caching and one on lexing…)
Here is an initial attempt at illustrating how some of the file and text/lexem buffers are related.
It would be nice if the LexInput structure could point to a LexemBuffer instead of holding separate pointers, for which it is impossible to know what they actually point to…
This could be achieved if we could remove the CharacterBuffer from LexemBuffer and make that a reference instead of a composition. Then we’d need to add a CharacterBuffer to the structures that have a LexemBuffer as a component (if they use it).
10.4. Modes
c-xrefactory
operates in different modes ("regimes" in original
c-xref
parlance):
-
xref - batch mode reference generation
-
server - editor server
-
refactory - refactory browser
The default mode is "xref". The command line options -server and -refactory select one of the other modes. Branching is done in the final lines of main().
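Reduced to its essence, the selection could be sketched like this. The enum and the function `selectMode()` are hypothetical. In the real code the option processing sets the mode as a side effect of parsing all options.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical names, only the option strings come from the text above. */
typedef enum { ModeXref, ModeServer, ModeRefactory } Mode;

/* Scan the arguments; "xref" is the default when no mode option is given. */
Mode selectMode(int argc, char *argv[]) {
    Mode mode = ModeXref;
    for (int i = 1; i < argc; i++) {
        if (strcmp(argv[i], "-server") == 0)
            mode = ModeServer;
        else if (strcmp(argv[i], "-refactory") == 0)
            mode = ModeRefactory;
    }
    return mode;
}
```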
The code for the modes is intertwined, probably through re-use of already existing functionality when extending to a refactoring browser.
One piece of evidence for this is that the refactory module calls the "main task" as a "sub-task". This forces some intricate fiddling with the options data structure, like copying and caching it, which I don’t fully understand yet.
TODO?: Strip away the various "regimes" into more separated concerns and handle options differently.
10.5. Options
The Options data structure is used to collect options from the command line as well as from options/configuration files and piped options from the editor client using process-to-process communication.
It consists of a collection of fields of the types
-
elementary types (bool, int, …)
-
string (pointers to strings)
-
lists of strings (linked lists of pointers to strings)
10.5.1. Allocation & Copying
Options has its own allocation using optAlloc
which allocates in a
separate area, currently part of the options structure and utilizing
"dynamic allocation" (dm_
functions on the Memory
structure).
The Options structure is copied multiple times during a session, both as a backup (savedOptions) and into a separate options structure used by the Refactorer (refactoringOptions).
Since the options memory is then also copied, all pointers into the options memory need to be updated. To be able to do this, the options structure contains lists of addresses that need to be "shifted".
When an option with a string or string list value is modified the option is registered in either the list of string valued options or the list of string list valued options. When an options structure is copied it must be performed using a deep copy function which "shifts" those options and their values (areas in the options memory) in the copy so that they point into the memory area of the copy, not the original.
After the deep copy the following point into the option memory of the copy
-
the lists of string and string list valued options (option fields)
-
all string and string list valued option fields that are used (allocated)
-
all list nodes for the used option (allocated)
-
all list nodes for the string lists (allocated)
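The "shifting" can be illustrated with a much reduced model. In this sketch the registered fields are kept as offsets in a small array instead of list nodes allocated in the options memory, and only plain strings are handled, so it shows the principle and not the actual code. `MiniOptions`, `setStringOption()` and `deepCopyOptions()` are made-up names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define OPT_MEMORY_SIZE 256
#define MAX_REGISTERED 8

/* Hypothetical, much simplified stand-in for the Options structure. */
typedef struct {
    char *project;                     /* a string valued option */
    char *outputFile;                  /* another one */
    size_t registered[MAX_REGISTERED]; /* offsets of fields to shift */
    int registeredCount;
    char memory[OPT_MEMORY_SIZE];      /* the "options memory" */
    size_t memoryIndex;
} MiniOptions;

/* Store the value in the options memory and register the field. */
bool setStringOption(MiniOptions *options, char **field, const char *value) {
    size_t length = strlen(value) + 1;
    size_t fieldOffset = (size_t)((char *)field - (char *)options);
    bool known = false;

    for (int i = 0; i < options->registeredCount; i++)
        if (options->registered[i] == fieldOffset)
            known = true;
    if (length > OPT_MEMORY_SIZE - options->memoryIndex
        || (!known && options->registeredCount == MAX_REGISTERED))
        return false;

    *field = memcpy(options->memory + options->memoryIndex, value, length);
    options->memoryIndex += length;
    if (!known)
        options->registered[options->registeredCount++] = fieldOffset;
    return true;
}

/* Copy everything, then make each registered field point into the
   memory of the copy instead of the memory of the original. */
void deepCopyOptions(MiniOptions *target, const MiniOptions *source) {
    memcpy(target, source, sizeof(*target));
    for (int i = 0; i < target->registeredCount; i++) {
        char **field = (char **)((char *)target + target->registered[i]);
        ptrdiff_t offset = *field - source->memory; /* still points into source */
        *field = target->memory + offset;
    }
}
```

After `deepCopyOptions(&copy, &original)` the string fields of the copy have the same content as before but live inside `copy.memory`, so the original can be changed or discarded independently.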
11. Algorithms
The code does not always explain the algorithms that it implements. This chapter will ultimately be a description of various algorithms used by c-xrefactory.
11.1. How is an Extract refactoring performed?
The region (mark and point/cursor positions) is sent to the c-xref
server in a -refactory -rfct-extract
command.
The server parses the relevant file and sets some information that can be used in some prechecks that are then performed, such as structure check, and then the server answers with
<extraction-dialog>
<str .... /str>
<str .... /str>
<str .... /str>
</extraction-dialog>
The first string is the code that will replace the extracted code, such as a call to the extracted function. The second string is the header part that will precede the extracted code ("preamble"), and the third is then of course any code that needs to go after the extracted code ("postamble").
The actual code in the region is never sent to, or returned from, the server. This is handled completely by the editor extension, and used verbatim (except if it is a macro that is extracted, in which case each line is terminated by the backslash) so no changes to that code can be made.
The pre- and post-ambles might be of varying complexity. E.g. when extracting a macro, the postamble can be completely empty. When extracting a function both may contain code to transfer and restore parameters into local variables to propagate in/out variables as required.
-
The editor then requests a name from the user that it will use in a rename operation that renames the default named function/macro/variable.
11.2. How does …
TBD.
Development Environment
11.3. Developing, here be dragons…
First, the code is terrible: lots of single and double character variables (cc, ccc, ..) and lots of administration on local variables rather than the structures that are actually there. And there are also a lot of macros. Unfortunately macros are hard to refactor to functions. (But I’m making progress…)
As there is no general way to refactor a macro to a function, various techniques must be applied. I wrote a blog post about one that has been fairly successful.
But actually it’s rather fun to be able to make small changes and see the structure emerge, hone your refactoring and design skills, and work on a project that started 20 years ago and which is still valuable, to me and, I hope, to others.
There should probably be a whole section on how to contribute and
develop c-xrefactory
but until then here’s a short list of what
you need:
-
C development environment (GNU/Clang/Make/…)
-
Unittests are written using
Cgreen
-
Clean code and refactoring knowledge (to drive the code to a better and cleaner state)
Helpful would be:
-
Compiler building knowledge (in the general sense, Yacc, but AST:s and symbol table stuff are heavily used)
11.4. Setup
TBD.
11.5. Building
You should be able to build c-xref using something like (may have changed over time…)
cd src
make
make unit
make test
But since the details of the building process are somewhat contrived and not so easy to see through, here’s the place where that should be described.
One step in the build process was generating initialization information
for all the things in standard include files, which of course became
very dependent on the system you are running this on. This has now moved
into functions inside c-xref
itself, like finding DEFINEs and include
paths.
The initial recovered c-xrefactory relied on having a working c-xref for the current system. I don’t really know how they managed to do that for all the various systems they were supporting.
Modern thinking is that you should always be able to build from source, so this is something that needed change. We also want to distribute c-xref as an el-get library which requires building from source and should generate a version specific for the current system.
The strategy selected, until some better idea comes along, is to try to build a c-xref.bs, if there isn’t one already, from the sources in the repository and then use that to re-generate the definitions and rebuild a proper c-xref. See Bootstrapping.
We have managed to remove the complete bootstrapping step, so c-xrefactory
now builds like any other project.
11.6. Versions
The current sources are in the 1.6.X range. This is the same as the original xrefactory and probably also the proprietary C++ supporting version.
There is an option, "-xrefactory-II", that might indicate that something was going on. But currently the only difference seems to be whether the edit server protocol output is in the form of non-structured fprintf:s or uses functions in the ppc-family (either calling ppcGenRecord() or fprint-ing using some PPC-symbol). This, together with hints in how the Emacs part starts the server and some initial server option variables in refactory.c, indicates that the communication between the editor and the refactory server is using this. It does not look like this is an attempt at a next generation.
What we should do is investigate if this switch actually is used anywhere but in the editor server context, and if so, if it can be made the default and the 'non-xrefactory-II' communication removed.
11.7. Coding
11.7.1. Naming
C-xref started (probably) as a cross-referencer for the languages supported (C, Java, C++) and originally had the name "xref", which became "xrefactory" when refactoring support was added. And when Mariàn released a "C only" version in 2009, most of the "xref" references and names were changed to "c-xref". So, as with most software, there is a history and a naming legacy to remember.
Here are some of the conventions in naming that are being used:
- olcx
-
"On-line CX" (Cross-reference) ?
- OLO
-
"On-line option" - some kind of option for the server
11.7.2. Modules and Include Files
The source code for c-xrefactory was using a very old C style with a separate proto.h where all prototypes for all externally visible functions were placed. Definitions are all over the place and it was hard to see where data is actually declared. This must change into a module-oriented include strategy.
Of course this will have to change into the modern x.h/x.c externally visible interface model so that we get clean modules that can be unittested.
The function prototypes have now been moved out to header files for each "module". Some of the types have also been moved, but this is still a work in progress.
11.8. Debugging
TBD. Attaching gdb
, server-driver
…
yaccp
from src/.gdbinit
can ease the printing of Yacc semantic data fields…
A helpful option is the recently added -commandlog=…
which allows
you to capture all command arguments sent to the server/xref process
to a file. This makes it possible to capture command sequences and
"replay" them. Useful both for debugging and creating tests.
11.9. Testing
11.9.1. Unittests
There are very few unittests at this point, only covering single digit percent of the code. The "units" in this project are unclear and entangled so creating unittests is hard since it was not built to be tested, test driven or even clearly modularized.
All unittests use Cgreen
as the unittest framework. If you are
unfamiliar with it the most important point is that it can mock
functions, so you will find mock implementations of all external
functions for a module in a corresponding <module>.mock
file.
Many modules are at least under test, meaning there is a <module>_tests.c in the unittest directory. Often only containing an empty test.
11.9.2. Acceptance Tests
In the tests
directory you will find tests that exercise the external
behaviour of c-xref
. Some tests actually do only that, they wouldn’t
really count as tests.
Most acceptance tests are hacks at this point, Make-scripts tweaked until they produce some expected output. But at least they get the coverage up (working our way up to the mid 60%), and more are added as bugs are found so they provide increasing confidence when developing.
There are two basic strategies for the tests:
-
run a c-xref command, catch its output and verify
-
run a series of commands using the EDIT_SERVER_DRIVER, collect output and results and verify
Some tests do not even test their output and only provide coverage.
Some tests do a very bad job at verifying, either because my understanding at that time was very low, or because it is hard to verify the output. E.g. the "test" for generating references only greps the CXrefs files for some strings, without verifying that they actually point to the correct place.
Hopefully this will change as the code gets into a better state and the understanding grows.
11.9.3. General Setup
Since all(?) c-xref operations rely on an options file which must contain absolute file paths (because the server runs as a separate process) it must be generated whenever the tests are to be run in a different location (new clone, test was renamed, …).
This is performed by using a common template in tests and a target in tests/Makefile.boilerplate.
Each test should have a clean
target that removes any temporary and
generated files, including the .c-xrefrc
file and generated
references. This way it is easy to ensure that all tests have updated
.c-xrefrc
files.
11.9.4. Edit Server Driver Tests
Since many operations are performed from the editor, and the editor starts an "edit server" process, many tests need to emulate this behaviour.
The edit server session is mostly used for navigation. Refactorings
are actually performed as separate invocations of c-xref
.
In utils
there is a server_driver.py
script, which will take as
input a file containing a sequence of commands. You can use this to
start an edit, refactory or reference server session and then feed it
with commands in the same fashion as an editor would do. The script
also handles the communication through the buffer file (see [Editor
Interface](./Design:-Editor-Interface)).
11.9.5. Creating More Edit Server Tests
You can relatively easily re-create a sequence of interactions by using the
sandboxed Emacs in tests/sandboxed_emacs
.
There are two ways to use it, "make spy" or "make pure". With the "spy" an intermediate spy is injected between the editor and the edit server, capturing the interaction to a file.
With "pure" you just get the editor setup with c-xref-debug-mode
and
c-xref-debug-preserve-tmp-files
on. This means that you can do what
ever editor interactions you want and see the communication in the
*Messages*
buffer. See [Editor Interface](./Design:-Editor-Interface)
for details.
Once you have figured out which parts of the *Messages*
buffer are
interesting you can copy that out to a file and run
utils/messages2commands.py
on it to get a file formatted for input
to server_driver.py
.
The messages2commands script converts all occurrences of the current directory to CURDIR so it is handy to be in the same directory as the sources when you run the conversion.
The messages2commands script removes any -preload so you need to take care that the positions inside the buffers are not changed between interactions lest the -olcursor and -olmark be wrong. (You can just undo the change after a refactoring or rename). Of course this also applies if you want to mimic a sequence of refactorings, like the jexercise move method example. Sources will then change so the next refactoring works from content of buffers, so you have to handle this specifically.
-preload is the mechanism where the editor can send modified buffers to c-xref so that you don’t have to save between refactorings, which is particularly important in the case of extract since the extraction creates a default name which the editor then does a rename of.
11.10. Utilities
11.10.1. Covers
utils/covers.py
is a Python script that, in some environments, can list which test cases execute a particular line.
This is handy when you want to debug or step through a particular part of the code.
Find a test that covers that particular line and run it using the debugger (usually make debug
in the test directory).
Synopsis:
covers.py <file> <line>
11.10.2. Sandboxed
utils/sandboxed
starts a sandboxed Emacs that uses the current elisp code and the c-xref
from src.
This allows you to test changes without having to pollute your own setup.
This actually runs the tests/sandboxed_emacs
pure version, which also sets up a completely isolated Emacs environment with its own packages loaded, configuration etc.
See below.
Synopsis:
sandboxed
11.11. Debugging the protocol
There is a "pipe spy" in tests/sandboxed_emacs
. You can build the
spy using
make spy
and then start a sandboxed Emacs which invokes the spy using
make
This Emacs will be sandboxed to use its own .emacs-files and have HOME set to this directory.
The spy will log the communication between Emacs and the real
c-xref
(src/c-xref
) in log files in /tmp
.
NOTE that Emacs will invoke several instances of what it believes is
the real c-xref
so there will be several log files to inspect.
12. Deployment
TBD.
13. Decision Log
Here we log all design and architectural decisions. Currently they are stored separately. Most of the actual, historic, decisions are of course lost.
14. Insights
This chapter contains notes of all insights, large and small, that I make as I work on this project. These insights should at some point be moved to some other, more structured, part of this document. But rather than trying to find a structure where each new finding fits, I’m making it easy to just dump them here. We can refactor these into a better and better structure as we go.
14.1. Yacc semantic data
As per usual a Yacc grammar requires each non-terminal to have a type. Those types are named after which types of data they collect and propagate. The names always start with ast_ followed by the data type. For example if some non-terminal needs to propagate a Symbol and a Position that structure would be called ast_symbolParameterPair ("Pair" being thrown in there for good measure…).
Each of those structures also always carries a begin and end position
for that structure. That means that any "ast" struct has three
fields, begin
, end
and the data. The data are sometimes a struct,
like in this case, but can also be a single value, like an int
or a
pointer to a Symbol
.
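Schematically, and with simplified stand-ins for Position and Symbol, such structures might look like the sketch below. The type names `ast_symbolPositionPair` and `ast_int`, the field name `data` and the helper `makeAstInt()` are chosen for this illustration and need not match the sources.

```c
#include <assert.h>

/* Simplified stand-ins for the real types. */
typedef struct {
    int file;
    int line;
    int col;
} Position;

typedef struct {
    const char *name;
} Symbol;

/* The data part when more than one value needs to be propagated. */
typedef struct {
    Symbol *symbol;
    Position position;
} SymbolPositionPair;

/* Every "ast" struct has three fields: begin, end and the data. */
typedef struct {
    Position begin;
    Position end;
    SymbolPositionPair data; /* here the data is a struct ... */
} ast_symbolPositionPair;

typedef struct {
    Position begin;
    Position end;
    int data; /* ... and here a single value */
} ast_int;

/* What a grammar action might do: wrap a value together with its extent. */
ast_int makeAstInt(Position begin, Position end, int value) {
    ast_int result = {.begin = begin, .end = end, .data = value};
    return result;
}
```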
15. Archive
In this section you can find some descriptions and saved texts that described how things were before. They are no longer true, since that quirk, magic or bad coding is gone. But it is kept here as an archive for those wanting to do backtracking to original sources.
15.1. Memory strategies
There were a multitude of specialized memory allocation functions. In principle there were two types, static and dynamic. The dynamic could be extended using an overflow handler.
Also one type had a struct where the actual area was extended beyond the actual struct. This was very confusing…
15.1.1. Static memory allocation
Static memory (SM_ prefix) consists of static areas allocated by the compiler, each of which is indexed using a similarly named index variable (e.g. ftMemory and ftMemoryIndex), something the macros took advantage of. These are
-
ftMemory
-
ppmMemory
-
mbMemory
One special case of static memory also exists:
-
stackMemory
- synchronous with program structure and has CodeBlock markers, so there is a special stackMemoryInit()
that initializes the outermost CodeBlock
These areas cannot be extended; when one overruns, the program stops.
15.2. Trivial Prechecks
The refactorer can call the server using parseBufferUsingServer()
and add some extra options (in text form).
One example is setMovingPrecheckStandardEnvironment()
where it calls the server with -olcxtrivialprecheck
.
However parseBufferUsingServer() uses callServer() which never calls answerEditAction().
In answerEditAction() the call to (unused) olTrivialRefactoringPreCheck() also requires an options.trivialPreCheckCode which is neither sent by setMovingPrecheckStandardEnvironment() nor parsed by processOptions().
The only guess I have is that previously all prechecks were handled by the -olcxtrivialprecheck option in calls to the server, and have now moved to their respective refactorings.
This theory should be checked by looking at the original source of the precheck functions and comparing that with any possible checks in the corresponding refactoring code.
15.3. HUGE Memory
Previously a HUGE model was also available (by re-compilation) to reach file numbers, lines and columns above 22 bits. But if you have more than 4 million lines (or columns!) you should probably do something radical before attempting cross referencing and refactoring.
15.4. Bootstrapping
15.4.1. BOOTSTRAP REMOVED!
Once the FILL-macros were removed, we could move the enum-generation to use the actual c-xref. So from now on we build c-xref directly from the sources in the repo. Changes to any enums will trigger a re-generation of the enumTxt-files, but since the enumTxt-files are only conversions of enum values to strings any mismatch will not prevent compilation, and it would even be possible to do a manual update. This is a big improvement over the previous situation!
15.4.2. FILLs REMOVED!
As indicated in FILL macros the bootstrapping of FILL-macros has finally and fully been removed.
Gone is also the compiler_defines.h
, which was just removed without
any obvious adverse effects. Maybe that will come back and bite me
when we move to platforms other than Linux and macOS…
Left is, at this point, only the enumTxt
generation, so most of the
text below is kept for historical reasons.
15.4.3. Rationale
c-xref uses a load of structures, and lists of them, that need to be created and initialized in a lot of places (such as the parsers). To make this somewhat manageable, c-xref itself parses the structures and generates macros that can be used to fill them with one call.
c-xref is also bootstrapped into reading in a lot of predefined header files to get system definitions as "preloaded definitions".
Why this pre-loading was necessary, I don’t exactly know. It might be an optimization, or an idea that was born early and then just kept on and on. In any case it creates an extra complexity building and maintaining and to the structure of c-xref.
So this must be removed, see below.
15.4.4. Mechanism
The bootstrapping uses c-xref's own capability to parse C-code and parse those structures and spit out filling macros, and some other stuff.
This is done using options like -task_regime_generate, which prints a lot of data structures on the standard output, which is then fed into generated versions of strFill, strTdef (no longer exists) and enumTxt by the Makefile.
The process starts with building a c-xref.bs executable from checked in sources. This compile uses a BOOTSTRAP define that causes some header files to include pre-generated versions of the generated files (currently strFill.bs.h and enumTxt.bs.h) which should work in all environments.
If you change the name of a field in a structure that is subject to FILL-generation you will need to manually update the strFill.bs.h, but a "make cleaner all" will show you where those are.
After the c-xref.bs has been built, it is used to generate strFill and enumTxt which might include specific structures for the current environment.
HOWEVER: if FILL macros are used for structures which are different on some platforms, say a FILE structure, that FILL macro will have a different number of arguments, so I’m not sure how smart this "smart" generation technique actually is.
TODO: Investigate alternative approaches to this generate "regime", perhaps move to a "class"-oriented structure with initialization functions for each "class" instead of macros.
15.4.5. Compiler defines
In options.h there are a number of definitions which somehow are sent to the compiler/preprocessor or used so that standard settings are the same as if a program will be compiled using the standard compiler on the platform. At this point I don’t know exactly how this conversion from C declarations to compile time definitions is done, maybe just entered as symbols in one of the many symboltables?
Typical examples include "__linux" but also on some platforms things like "fpos_t=long".
I’ve implemented a mechanism that uses "gcc -E -dM" to print out and catch all compiler defines in compiler_defines.h. This was necessary because of such definitions on Darwin which were not in the "pre-programmed" ones.
TODO?: As this is a more general approach it should possibly
completely replace the "programmed" ones in options.c
?
15.4.6. EnumTxt generation REMOVED!
To be able to print the string values of enums the module generate.c (called when regime was RegimeGenerate) could also generate string arrays for all enums. By replacing that with some pre-processor magic for the few that were actually needed (mostly in log_trace() calls) we could do away with that whole "generate" functionality too.
(Last commit with enum generation intact is https://github.com/thoni56/c-xrefactory/commit/aafd7b1f813f2c17c684ea87ac87a0be31cdd4c4.)
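One common form of such pre-processor magic is the so called X-macro, where a single list generates both the enum and the strings. Whether this is exactly the technique that was used is not stated here, so take the following as a sketch of the general idea. The enum, its value names and `regimeName()` are invented for the example.

```c
#include <assert.h>
#include <string.h>

/* The single source of truth: one X(...) entry per enum value.
   The value names are made up for this sketch. */
#define ALL_REGIMES(X) \
    X(RegimeXref)      \
    X(RegimeServer)    \
    X(RegimeRefactory)

#define AS_ENUM(name) name,
#define AS_STRING(name) #name,

/* Expands to: typedef enum { RegimeXref, RegimeServer, ... } Regime; */
typedef enum { ALL_REGIMES(AS_ENUM) RegimeCount } Regime;

/* Expands to: { "RegimeXref", "RegimeServer", ... } */
const char *regimeNames[] = {ALL_REGIMES(AS_STRING)};

/* Convert an enum value to its name, e.g. for use in trace output. */
const char *regimeName(Regime regime) {
    int index = (int)regime;
    return index >= 0 && index < RegimeCount ? regimeNames[index] : "<unknown>";
}
```

Since the enum and the string table come from the same list they cannot get out of sync, which is what the generated enumTxt files had to ensure through re-generation.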
15.4.7. enumTxt
For some cases the string representing the value of an Enum is needed.
c-xref
handles this using the "usual" 'parse code and generate' method.
The module generate.c
does this generation too.
15.4.8. Include paths
Also in options.h some standard-like include paths are added, but there is a better attempt in getAndProcessGccOptions() which uses the compiler/preprocessor itself to figure out those paths.
TODO?: This is much better and should really be the only way, I think.
15.4.9. Problems
Since at bootstrap there must exist FILL-macros with the correct field names, this strategy is an obstacle to cleaning up the code since every field is referenced in the FILL macros. When a field (in a structure which is filled using the FILL macro) changes name, this will make initial compilation impossible until the name of that field is also changed in the strFill.bs.h file.
One way to handle this is of course to use c-xrefactory
itself and
rename fields. This requires that the project settings also include a
pass with BOOTSTRAP set, which it does.
15.4.10. Removing
I’ve started removing this step. In TODO.org I keep a hierarchical list of the actions to take (in a Mikado kind of style).
The basic strategy is to start with structures that no other structure
depends on. Using the script utils/struct2dot.py
you can generate a
DOT graph that shows those dependencies.
Removal can be done in a couple of ways
-
If it’s a very small structure you can replace a call to a FILL_XXX() macro with a compound literal.
-
A better approach is usually to replace it with a fillXXX() function, or even better, with a newXXX(), if it is consistently preceded by an allocation (in the same memory!). To see what fields vary you can grep all such calls, make a CSV-file from that, and compare all rows.
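For a small structure like the symbol list the two replacement styles could be sketched as follows. The field names `d` and `next` are the ones used in the FILL macros section, while `fillSymbolList()` and `newSymbolList()` are example names following the fillXXX()/newXXX() pattern. Note that malloc() is only a stand-in here, since the allocation has to happen in the same memory as the code being replaced used.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

typedef struct Symbol Symbol; /* opaque in this sketch */

typedef struct SymbolList {
    Symbol *d;
    struct SymbolList *next;
} SymbolList;

/* fillXXX(): initialize an already allocated structure, using a
   compound literal so that every field is set explicitly. */
void fillSymbolList(SymbolList *list, Symbol *d, SymbolList *next) {
    *list = (SymbolList){.d = d, .next = next};
}

/* newXXX(): allocation and initialization in one step.
   malloc() is a stand-in for the appropriate memory area. */
SymbolList *newSymbolList(Symbol *d, SymbolList *next) {
    SymbolList *list = malloc(sizeof(*list));
    if (list != NULL)
        fillSymbolList(list, d, next);
    return list;
}
```

Compared to the macro, both functions are visible to the compiler, the debugger and to refactoring tools, and renaming a field only requires a change in one place.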
15.4.11. strTdef.h
The strTdef.h
was generated using the option -typedefs
as a part
of the old -task_regime_generate
strategy and generated typedef
declarations for all types found in the parsed files.
I also think that you could actually merge the struct definition with the typedef so that strTdef.h would not be needed. But it seems that this design is because the structures in proto.h are not a directed acyclic graph, so loops make that impossible. Instead the typedefs are included before the structs:
#include "strTdef.h"
struct someNode { S_someOtherNode *this; ...
struct someOtherNode { S_someNode *that; ...
This is now idiomatically solved using the structs themselves:
struct someNode { struct someOtherNode *this; ...
struct someOtherNode { struct someNode *that; ...
15.5. FILL macros
The FILL macros are now fully replaced by native functions or some other, more refactoring-friendly, mechanism. Yeah!
During bootstrapping a large number of macros named __FILL_xxxx are created. The intent is that you can fill a complete structure with one call, somewhat like a constructor, but here it’s used more generally every time a complex struct needs to be initialized.
There are even _FILLF_xxx macros which allow filling fields in sub-structures at the same time.
This is, in my mind, another catastrophic hack that makes
understanding, and refactoring, c-xrefactory
such a pain. Not to
mention the extra bootstrap step.
I just discovered the compound literals of C99, and I’ll experiment with replacing some of the FILL macros with compound literal assignments instead.
FILL_symbolList(memb, pdd, NULL);
could become (I think):
memb = (SymbolList){.d = pdd, .next = NULL};
If successful, it would be much better, since we could probably get rid of the bootstrap, but primarily it would be more explicit about which fields are actually necessary to set.
15.6. Users
The -user
option has now been removed, both in the tool and the
editor adaptors, and with it one instance of a hashlist, the
olcxTab
, which now is a single structure, the sessionData
.
There is an option called -user
which Emacs sets to the frame-id. To
me that indicates that the concept is that for each frame you create
you get a different "user" with the c-xref
server that you (Emacs)
created.
The jedit adapter seems to do something similar:
options.add("-user");
options.add(s.getViewParameter(data.viewId));
Looking at the sources to find when the function
olcxSetCurrentUser()
is called it seems that you could have
different completion, refactorings, etc. going on at the same time in
different frames.
Completions etc. require user interaction so they are not controlled by the editor alone. At first glance though, the editor (Emacs) seems to block multiple refactorings and reference maintenance tasks running at the same time.
This leaves just a few use cases for multiple "users", and I think it adds unnecessary complexity. Going for a more "one user" approach, like the model in the language server protocol, this could really be removed.