Extended BASIC Assembler


Version 2.01,         12 June 2000   (BASIC 1.05-1.16)
        2.01 patch 2, 14 August 2000 (BASIC 1.20)
by Darren Salt <ds@youmustbejoking.demon.co.uk>
AOF code by Paul Clifford <paul@plasma.demon.co.uk>
Contains code from v1.00 and v1.30 by Adrian Lees

(This is a beta version.)


Licence


ExtBASICasm is freeware. No restrictions are placed on its use; what you do
with it is your own responsibility.

Release versions of ExtBASICasm may be freely distributed in unchanged form;
patched versions may not be distributed unless the patches are *clearly*
documented; such documentation must be distributed with the patched module.

Beta versions of ExtBASICasm must not be redistributed.


Intro


It is recommended that you read through this document before first using this
version of ExtBASICasm, and check through it quickly when upgrading. It's
entirely possible that one of the extras the module provides may cause a
clash with existing programs, for example the APCS-R register names clashing
with variables used as register names.

Note also that APCS-R register names are disabled by default.


The module ExtBASICasm provides a patch for various versions of BBC BASIC V
to allow the direct use of the extra instructions provided by the ARM3, ARM6,
ARM7 and StrongARM processors. It adds the missing floating-point and general
coprocessor instructions, along with some assembler directives which will be
familiar (and a few unfamiliar) to Acorn Assembler users. The APCS-R register
names may also be used, and AOF files may be generated.

To make the necessary changes to the BASIC module it must be located in RAM.
The ExtBASICasm module will therefore attempt to RMFaster the BASIC module,
which will require a small amount of memory in the RMA, in addition to that
required by the ExtBASICasm module itself. Attempting to run it while BASIC
is active and in ROM will not work - try "*RMFaster BASIC" at the BASIC
prompt and you'll see why.

  Ver	OS	Supported?
  1.05	3.1x	Yes
  1.06	3.5x	Yes
  1.14	3.6x	Yes
  1.16	3.7x	Yes
  1.17	3.8x	Yes
  1.18	4.00	No (1)
  1.19	4.01	No (1)
  1.20	4.0x	Yes

(1) Support for these versions will not be added since v1.20 is available for
    softload on RISC OS 4.


Enabling ExtBASICasm


Unlike some earlier versions, this version is initialised into a dormant
state whenever you start up the BASIC interpreter, eg. by double-clicking on
a BASIC program or by typing BASIC at the * prompt.

You can enable or disable the extensions by using the assembler pseudo-op
	EXT n
where n is 0 to disable and 1 to enable. (Other values are currently mapped
to 1; do not rely on this.)

Setting any of the extension OPT bits *NO* *LONGER* enables ExtBASICasm.

Certain extensions remain enabled at all times: specifically, ALIGN always
zero-fills, and the ".foo = bar" bug remains fixed. I don't think that
this'll inconvenience anybody :-)


ExtBASICasm uses the BASIC data word TIMEOF, which is documented as "unused"
for all versions of BASIC V which it recognises, for its 'enabled' flag. It
also uses the byte at &86E4 for its extended options byte.


The instructions added by the module are as follows:


Extensions


	Optional parts are enclosed in []

OPT	[<value>]
	Bit 4:	ASSERT control (1 = enabled on 'second pass')
	Bit 5:  APCS register names (1 = enabled)
	Bit 6:	UMUL/UMULL control (0 = short forms, 1 = long forms)
	Bit 7:	AOF control. If set, then the AOF extension is enabled.

	When ExtBASICasm is disabled, these bits take on their standard
	meanings: bit 4 allows use of, amongst others, the FP instructions.

	If value is omitted, the previous setting is used.

EXT	<flag>
	<flag>,<value>
	<flag>,
	Initialises or disables the extensions according to flag. The second
	form allows you to simultaneously set the OPT value; the third (a
	side effect of the use of the OPT code) causes the previous OPT value
	to be used.

ALIGN
	Zero-initialises the memory if required.

ALIGN	<const>[,<const2>]
	Aligns to a multiple of const bytes plus an optional offset. const
	must be a power of 2 between 1 and 65536; const2 must be between 0
	and const-1 (default is 0). Also zero-initialises the memory.
	P% becomes ((P%+const-1) AND NOT (const-1))+const2; O% is also updated
	if necessary. Examples:
		ALIGN 4
		ALIGN 32
		ALIGN 16,8

MUL{cond}{S}	Rd,Rm,#<const>
	Variable length. Rd may equal Rm only if fewer than two ADD/RSB
	instructions are needed.
	May cause 'duplicate register' if Rd=Rm and const is not simple - ie.
	not 0, (2^x)-1, 2^x, (2^x)+(2^y)

MLA{cond}{S}	Rd,Rm,#<const>,Ra
	Variable length. Rd may equal Rm only if fewer than two ADD/RSB
	instructions are needed.
	Rd=Ra causes 'duplicate register' error if const is not simple, as
	for MUL; Rd=Rm=Ra is special in that MLA Rd,Rd,#c,Rd = MUL Rd,Rd,#c+1
	If Rd=Ra and const=0, no code is generated (none necessary).
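
	The 'simple' constants listed for MUL, and the Rd=Rm=Ra identity for
	MLA, can be checked with a short Python model. This is only a sketch
	of the arithmetic described above, not the module's code generator;
	the function names are made up for illustration.

```python
def is_power_of_two(n):
    """True if n is 2^x for some x >= 0."""
    return n > 0 and (n & (n - 1)) == 0


def is_simple_multiplier(c):
    """True if c is 0, 2^x, (2^x)-1 or (2^x)+(2^y)."""
    if c == 0 or is_power_of_two(c) or is_power_of_two(c + 1):
        return True
    # Exactly two bits set means (2^x)+(2^y) with x different from y.
    return c > 0 and bin(c).count("1") == 2


def mla_same_register(rd, c):
    """MLA Rd,Rd,#c,Rd computes Rd*c + Rd, ie. Rd*(c+1), modulo 2^32."""
    return (rd * c + rd) & 0xFFFFFFFF


print(is_simple_multiplier(10))   # 10 is 8+2
print(is_simple_multiplier(11))   # three bits set, and 12 is not 2^x
```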

DIV		Rq,Rr,Rn,Rd,Rt [SGN Rs]
	Integer division by register
	Rq = quotient		Rn = numerator		Rt = temporary store
	Rr = remainder		Rd = denominator	Rs = sign store
	If Rs omitted then division is unsigned.
	Rr may be same register as Rn *or* Rn may be same as Rs.
	All other registers must be different.
	Rt and Rs (if specified) are corrupted.

DIV		Rq,Rr,Rn,#d[,[Rt]] [SGN Rs]
	Integer division by constant
	Registers as above
	If Rs omitted then division is unsigned.
	If Rt omitted and is required for this division then error given.
	All registers must be different.
	If specified, Rt and Rs are corrupted.
	(Uses generator to build code - fast but may be long)
	Notes:	Uses Fourier method. For unsigned values, this is fixed to
		handle unsigned top-bit-set properly, *except* for div by 3
		which works for values up to &C0000000. Ideas and code
		gratefully received...

	*** Note no conditional for DIV

SQR		Rt,Rr,Rx,Ry
SQR		Rt,Rr,{Rx,Ry}
	Square root
	Input	Rr = value, Rx & Ry = work registers
	Output	Rt = square root, Rr = remainder, Rx = 0, Ry corrupt
		In effect, Rt = INT SQR Rr and Rr' = Rr-Rt*Rt
	If {} are used, then Rx and Ry are preserved via STMFD R13!,{Rx,Ry}
	and restored via LDMFD R13!,{Rx,Ry}.

	*** Note no conditional for SQR
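
	The result of SQR can be modelled in a couple of lines of Python.
	This sketch assumes that Rr is treated as an unsigned 32-bit value,
	which the text above does not state; sqr_model is a hypothetical
	name, and the code says nothing about how the module computes the
	root.

```python
import math


def sqr_model(rr):
    """Return (Rt, Rr') for an input Rr, taken as unsigned 32-bit."""
    rr &= 0xFFFFFFFF
    rt = math.isqrt(rr)        # Rt = INT SQR Rr
    return rt, rr - rt * rt    # Rr' = Rr - Rt*Rt


print(sqr_model(10))
```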

ADR{cond}L	Rd,<const>
	Fixed length (two words)

ADR{cond}X	Rd,<const>
	Fixed length (three words)

ADR{cond}W	Rd,<const>
	Addressing relative to R12, one to three words
	<const> MUST be defined before it is used
	Adds/subtracts const to/from R12, storing result in Rd
	Up to you to ensure that R12 correctly set up...

LDR, STR
	xxx{cond}{B}W	Rd,<offset>
	  Load/store word/byte at [R12,#<offset>]

	LDR{cond}{B}L	Rd,<address>
	LDR{cond}{B}L	Rd,[Rm,#<offset>]{!}
	LDR{cond}{B}WL	Rd,<offset>
	STR equivalents
	  Addressing range is 1MB; some offsets outside this range are also
	  valid. Lengths are (in words):
	    LDR	       2  ADD/SUB Rd,Rm,#a:LDR Rd,[Rd,#b]
	    LDR ...]!  2  ADD/SUB Rm,Rm,#a:LDR Rd,[Rm,#b]!
	    STR	       3  ADD/SUB Rm,Rm,#a:STR Rd,[Rm,#b]:SUB/ADD Rm,Rm,#a
	    STR ...]!  2  ADD/SUB Rm,Rm,#a:STR Rd,[Rm,#b]!
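
	The 1MB figure follows from the two fields involved: b is a 12-bit
	LDR/STR offset, and a must be a valid ADD/SUB immediate (an 8-bit
	value rotated right by an even amount). The Python sketch below
	models one straightforward way of splitting an offset into a and b.
	It is an assumption for illustration; the module may well find
	splits that this simple version misses.

```python
MASK32 = 0xFFFFFFFF


def is_arm_immediate(value):
    """True if value is an 8-bit constant rotated right by an even amount."""
    value &= MASK32
    for rot in range(0, 32, 2):
        rotated = ((value << rot) | (value >> (32 - rot))) & MASK32
        if rotated < 256:
            return True
    return False


def split_offset(offset):
    """Split offset into (op, a, b), or return None if this split fails.

    op is ADD for positive offsets and SUB for negative ones; for SUB
    the LDR/STR offset b is subtracted as well.
    """
    op = "ADD" if offset >= 0 else "SUB"
    magnitude = abs(offset)
    b = magnitude & 0xFFF      # fits the 12-bit LDR/STR offset field
    a = magnitude - b          # must be a valid ADD/SUB immediate
    return (op, a, b) if is_arm_immediate(a) else None


print(split_offset(0x12345))      # within 1MB
print(split_offset(0x1000010))    # outside 1MB, but a is still encodable
print(split_offset(0x123456))     # a would need nine significant bits
```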

	LDR{cond}{B}L	Rd,{Rn},<address>
	LDR{cond}{B}L	Rd,{Rn},[Rm,#<offset>]
	LDR{cond}{B}WL	Rd,{Rn},<offset>
	STR equivalents
	  [{Rn} is NOT optional]
	  Equivalent to the LDR/STRs above, except that Rn (rather than Rd)
	  is used to hold the address; always two words long. For example,
	  ADRL R0,wibble:LDR R1,[R0] may be replaced with LDRL R1,{R0},wibble
	  - one word shorter.
	  Rd=Rn is not allowed.
	  Assembles to ADD/SUB Rn,Rm,#a : LDR/STR Rd,[Rn,#b]!

	LDR{cond}{B}L	Rd,[Rm],#<offset>
	STR equivalent
	  Addressing range is 1MB; some offsets outside this range are also
	  valid. Two words long.
	  Assembles to LDR/STR Rd,[Rm],#b:ADD/SUB Rm,Rm,#a

	NOTE: You should try to avoid using *sequences* of LDRLs or STRLs -
	there is usually a more efficient way.

	LDRxxH, LDRxxSH, LDRxxSB and STRxxH
		The W forms are supported.
		Long LDR{H|SH|SB} not yet implemented.

SWAP{cond}[S]	Rd,Rn
	Swaps Rd and Rn without using temporary store.
	Uses EOR method, is therefore three words long.
	If S is specified, then the flags are set according to Rn.
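
	The EOR method is the familiar three-step exclusive-OR exchange. A
	Python model is given below; the order in which the module applies
	the three EORs is not stated above, so the sequence shown is an
	assumption.

```python
def eor_swap(rd, rn):
    """Exchange two register values using three EORs and no temporary."""
    rd ^= rn    # EOR Rd,Rd,Rn
    rn ^= rd    # EOR Rn,Rn,Rd   (Rn now holds the original Rd)
    rd ^= rn    # EOR Rd,Rd,Rn   (Rd now holds the original Rn)
    return rd, rn


print(eor_swap(5, 9))
```

	Note that the method relies on Rd and Rn being different registers:
	EORing a register with itself gives zero.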

VDU{cond}{X}	<const>
	= SWI "OS_WriteI"+<const>
	With X present, XOS_WriteI is used instead.

NOP{cond}
	= MOV{cond} R0,R0
	(BASIC supports only an unconditional NOP.)

BRK{cond} [#<const>]
	Undefined instruction. If <const> is specified, then R14 is set to
	this value before the undefined instruction trap is taken.

EQUx, DCx, =
	xxx <value>[,<value>]^
	Extended form of EQUD, EQUW, DCB, etc.
	Instead of, eg. DCD 0 : DCD 12 : DCD branch
	you can now use DCD 0, 12, branch

Negative constants
	Allowed in the following instructions:
		ADD, SUB	ADC, SBC	ADF, SUF
		AND, BIC	MOV, MVN	MVF, MNF
		CMP, CMN	CMF, CNF	CMFE, CNFE
	If the constant is invalid for one of these, it is negated or
	inverted, as appropriate, and the instruction changed to the other of
	the pair (eg. ADC becomes SBC). If the constant is still invalid, the
	"bad immediate constant" error is generated as normal.

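	The substitution can be modelled in Python for the integer
	instructions. This is a sketch only: PAIRS and fix_immediate are
	hypothetical names, and the floating-point pairs are left out. ADD,
	SUB, CMP and CMN pair by negation; MOV, MVN, AND, BIC, ADC and SBC
	pair by inversion (SBC subtracts one more when carry is clear, so
	ADC #c matches SBC #NOT c).

```python
MASK32 = 0xFFFFFFFF


def is_arm_immediate(value):
    """True if value is an 8-bit constant rotated right by an even amount."""
    value &= MASK32
    for rot in range(0, 32, 2):
        rotated = ((value << rot) | (value >> (32 - rot))) & MASK32
        if rotated < 256:
            return True
    return False


def negate(c):
    return -c & MASK32


def invert(c):
    return ~c & MASK32


# Each instruction maps to its partner and the transform applied to the constant.
PAIRS = {
    "ADD": ("SUB", negate), "SUB": ("ADD", negate),
    "CMP": ("CMN", negate), "CMN": ("CMP", negate),
    "MOV": ("MVN", invert), "MVN": ("MOV", invert),
    "AND": ("BIC", invert), "BIC": ("AND", invert),
    "ADC": ("SBC", invert), "SBC": ("ADC", invert),
}


def fix_immediate(op, const):
    """Return (op, const) with a valid immediate, swapping the op if needed."""
    const &= MASK32
    if is_arm_immediate(const):
        return op, const
    other, transform = PAIRS[op]
    alternative = transform(const)
    if is_arm_immediate(alternative):
        return other, alternative
    raise ValueError("bad immediate constant")


print(fix_immediate("ADD", -4))
print(fix_immediate("MOV", -1))
```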

ARMv3 (ARM6) and later


MSR{cond}	<psr>_<f>,Rm
	(Standard extension.)
	<f> may, in addition to the standard combinations of 'c' 'x' 's' 'f',
	be one of:
		ctl	control bits only
		flg	flag bits only
		all	both
	Any combination of 'cf' is equivalent to 'all'; you may also, in the
	standard form, use '_' between letters.


ARMv4 (ARM8, StrongARM) and later


UMUL, SMUL, UMLA, SMLA:
	xxx{cond}{S}	Rl,Rh,Rm,Rn

	The 'official' forms UMULL, SMULL, UMLAL, SMLAL are used *instead of*
	the 'short' forms if extended OPT bit 6 is set.
	Unfortunately it's not possible to allow both forms at once: how
	would you interpret "UMULLS" - UMUL condition LS or UMULL with S bit?


Floating-point instructions


Floating point coprocessor data transfer

LDF, STF:
	xxx{cond}precW	Fd,<offset>

LFM{cond}{stack}	Fd,m,[Rn]{!}
SFM{cond}{stack}	Fd,m,[Rn]{!}
LFS{cond}{stack}	Rn{!},<fp register list>
SFS{cond}{stack}	Rn{!},<fp register list>

	LFM, SFM, LFS and SFS use extended precision. The <fp register list>
	is much as for LDM and STM, with restrictions: you must specify a
	register or a sequence of registers, and the list must be compatible
	with LFM and SFM - eg.
	LFSFD R13!,{F3}		LFMFD F3,1,[R13]!	LFM F3,1,[R13],#12
	SFSFD R13!,{F5-F0}	SFMFD F5,4,[R13]!	SFM F5,4,[R13,#-48]!
	LFSDB R13,{F1,F0}	LFMDB F0,2,[R13]	LFM F0,2,[R13,#-24]
	- for each row, all the instructions have the same effect.
	Available stack types are DB, IA, EA, FD.
	Note that example 2 wraps around - F5, F6, F7, F0 _in that order_.
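
	The register sequence and transfer size for a given LFM/SFM can be
	worked out as follows. This is a small Python model with made-up
	function names; it relies on each register occupying three words
	(12 bytes) in the LFM/SFM format.

```python
def fp_register_list(first, count):
    """Registers moved by LFM/SFM Fd,count: count registers from Fd, wrapping after F7."""
    return ["F%d" % ((first + i) % 8) for i in range(count)]


def transfer_size(count):
    """Bytes moved: three words (12 bytes) per register."""
    return 12 * count


print(fp_register_list(5, 4), transfer_size(4))
```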


Assembler directives


* Conditional - will STOP if expression is FALSE:

ASSERT	<expression>

	Bit 4 of the extended OPT value controls ASSERT. When it and bit 1
	are zero, ASSERTs are ignored.

* Constants

=	<const|string>
	The bug causing an error when used in the form
		.label = "something"
	has been fixed.

EQUF, DCF
	xxx	<const>

	Synonyms for EQUFD.

EQUP, DCP, P
	xxx	<string>,<const>
	xxx	<const>,<string>
	Fixed-length string allocation. If the string is too short, then the
	remaining space is padded with nulls; if it is too long, it is
	truncated to the specified length.

EQUPW, DCPW, PW
	xxx	<pad_byte>,<string>,<const>
	xxx	<pad_byte>,<const>,<string>
	Like EQUP, except that you specify the padding byte.

EQUZ, DCZ, Z
	xxx	<string>
	EQUS with automatic zero termination

EQUZA, DCZA, ZA
	xxx	<string>
	Equivalent to EQUZ followed by ALIGN

	Note: *ALL* the EQU... directives (and their equivalents) may have
	their arguments repeated as described in the Extensions section.
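
	The bytes produced by the string directives can be modelled in
	Python as below. This is an illustrative sketch with hypothetical
	function names; it assumes that the ALIGN implied by EQUZA is a
	plain word (4-byte) alignment with zero fill.

```python
def equp(string, length, pad=0):
    """EQUP (pad=0) or EQUPW: exactly `length` bytes, padded or truncated."""
    data = string[:length]
    return data + bytes([pad]) * (length - len(data))


def equz(string):
    """EQUZ: the string followed by a zero terminator."""
    return string + b"\0"


def equza(string, address):
    """EQUZA: EQUZ assembled at `address`, then zero-filled to a word boundary."""
    data = equz(string)
    end = address + len(data)
    return data + b"\0" * (-end % 4)


print(equp(b"abc", 5))
print(equp(b"abcdef", 4))
print(equza(b"hi", 0x8000))
```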

FILL, %
	xxx{B|W|D}	<const>{,{<value>}}
	Allocates <const> bytes of memory, initialised to <value> (or 0).
	B, W and D represent data lengths as for EQU; if omitted, then byte
	length is assumed. If the comma is present but no fill value, this is
	equivalent to adding the constant to P% (and O% if appropriate).

FILE	<filename>
	Loads the specified file, allocating just enough space for it.

^	<offset>
	Initialises the workspace address pointer to the given value.
	This is used and updated by #.
	Typical use:
		^ 0
		...
		# flags, 4
		...
		LDRW	R0,flags

#	<variable>, <length>
	Sets the variable to the current value of the workspace address
	pointer, which is then incremented by <length>.
	This does not alter P% or O%.
	(Note: the variable is assigned before the length is evaluated.)
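
	The way ^ and # cooperate can be pictured with a tiny Python model
	of the workspace address pointer. The class and variable names are
	invented for the example.

```python
class Workspace:
    def __init__(self, offset=0):
        # Models "^ offset": set the workspace address pointer.
        self.pointer = offset

    def reserve(self, length):
        # Models "# variable, length": the variable takes the current
        # pointer value, then the pointer moves on by length.
        value = self.pointer
        self.pointer += length
        return value


ws = Workspace(0)
flags = ws.reserve(4)
buffer = ws.reserve(256)
count = ws.reserve(4)
print(flags, buffer, count)
```

	Each value is an offset suitable for the W forms, eg. LDRW R0,flags.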

COND	<cond>
	Sets the condition code for use with = (when used as a condition
	code). It may be supplied as a condition code literal, a number (0 to
	15), or a string containing a condition code literal. For example,
	all of the following are equivalent:
		COND	7		; number
		COND	VC		; condition code literal
		COND	vc		; condition code literal
		COND	"Vc"		; string containing cond. code lit.
	Example code:
		COND	LT		; select LT condition code
		MOV=	R0,#2		; MOVLT  R0,#2
		MOV=S	R1,R2		; MOVLTS R1,R2
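
	The equivalence of the COND forms follows from the standard ARM
	condition code numbering. The Python sketch below models the
	lookup; cond_number is a hypothetical name, and the aliases HS and
	LO are left out because the text does not say whether COND accepts
	them.

```python
# The sixteen ARM condition codes in encoding order (EQ = 0 ... NV = 15).
CONDITIONS = ["EQ", "NE", "CS", "CC", "MI", "PL", "VS", "VC",
              "HI", "LS", "GE", "LT", "GT", "LE", "AL", "NV"]


def cond_number(cond):
    """Map a number (0 to 15) or a condition code name to its number."""
    if isinstance(cond, int):
        if 0 <= cond <= 15:
            return cond
        raise ValueError("condition number out of range")
    return CONDITIONS.index(cond.upper())   # ValueError if unknown


print(cond_number(7), cond_number("VC"), cond_number("vc"), cond_number("Vc"))
```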


AOF generation


AREA	"name", "attributes"

	An area is a named block of code or data that can be manipulated by a
	linker, such as drlink, to form the final program image. Typically, a
	program will be divided into two main areas; one for code and the
	other for data. For example, Acorn's C compiler uses areas named
	"C$$code" and "C$$data".

	Each area has a set of attributes which provide extra information to
	the linker. The attributes recognised by ExtBASICasm are listed below
	(the case is ignored):

	32BIT	    The code complies with the 32-bit variant of the APCS.
	CODE	    Contains machine code instructions.
	DATA	    Contains data, not instructions.
	EXTFP	    The extended floating-point instruction set is used (LFM
		    and SFM instead of multiple LDFEs and STFEs).
	NOCHECK	    The code complies with a variant of the APCS without
		    software stack-limit checking.
	PIC	    Position Independent Code; will execute where loaded
		    without modification.
	READONLY    This area will not be written to.
	REENTRANT   The code complies with the reentrant APCS standard.

	Each attribute should be separated from the next by a comma (spaces
	are optional and ignored when processing the list). It is important
	to make sure that all areas of the same name also have the same
	attributes, otherwise the linking process will fail.

	Example area definitions:

	AREA	"C$$code", "CODE, READONLY"
	AREA	"MyProgram$Data", "DATA"


IMPORT	"symbol" ["attributes"]
IMPORT	"symbol", "alternative name" ["attributes"]

	References between areas are handled via "symbols". Each area exports
	a list of symbols that are available for external use, eg "strlen",
	"printf", and "fread" from the C library stubs. An area can then
	import the symbol and use it as if it were defined locally, leaving
	the linker to resolve the references later.

	IMPORT makes an external symbol available for use within the current
	area. It creates a variable of the same name, ending with an @, which
	can be used with STR, LDR, EQUD, EQUW and B instructions. An
	alternative name can be supplied, making it possible to import
	symbols containing characters that are illegal in a BASIC variable
	name, such as $. Example uses:

	IMPORT  "strcpy"		; import strcpy as strcpy@
	IMPORT  "an$example", "example" ; import an$example as example@

	BL	strcpy@			; call the strcpy routine
	LDR	R0, example@		; load a word from an$example

	Note: The '@' variables should always be treated as read-only.

	The optional attributes are as follows:

	FPREG	    This is only meaningful if the imported symbol is a
		    function entry point and indicates that floating point
		    arguments will be passed in floating point registers.
	INSENSITIVE The linker will ignore the case when trying to resolve
		    the reference.
	NOCASE	    A synonym for INSENSITIVE.
	WEAK	    It is acceptable for the reference to remain unsatisfied.

	Example:

	IMPORT  "xyz", "abc" ["nocase"] ; import xyz as abc@, ignoring case


EXPORT	"symbol" ["attributes"]
EXPORT	"symbol", "alternative name" ["attributes"]

	EXPORT makes an address within the current area available for outside
	use. The "symbol" is the BASIC variable containing the address and,
	if the alternative name is missing, the name under which it is
	exported. If the variable is an integer you would probably want to
	supply an alternative name to remove the % at the end. Example uses:

	EXPORT  "compare%", "uint_cmp"  ; export compare% as uint_cmp
	EXPORT  "value", "an$integer"	; export value as an$integer
	EXPORT  "value"			; also export value as value

	.compare%
	CMP	R0, R1
	MOVLO	R0, #-1
	MOVEQ	R0, #0
	MOVHI	R0, #1
	MOVS	PC, R14
	.value
	DCD	&12345678

	The optional attributes are:

	DATA	This is only meaningful if the symbol occurs in a code area,
		and indicates that the symbol defines data rather than code.
	FPREG	This is only meaningful if the symbol defines a function
		entry point and indicates that floating point arguments
		should be passed in floating point registers.
	LEAF	The symbol defines a function that makes no calls to any
		other functions.
	STRONG	This symbol should be used in preference to any other
		non-strong symbol when resolving references in other files.
		Any references to the symbol within the same file as the
		strong definition resolve to the non-strong definition. This
		allows a kind of link-time indirection.

	Example:

	EXPORT  "leaf_fn" ["leaf"]	; export leaf_fn as a leaf symbol


INCLUDE	"filename"

	INCLUDE will load the specified BASIC program as a library (first
	pass only) and call the definition of FNinclude in that program, if
	one exists. The function call ignores definitions of FNinclude
	elsewhere, such as other INCLUDE'd files.


END
SAVE	"filename"

	It is necessary to explicitly mark the end of an AOF file, to allow
	the extra data to be inserted in the correct place, and to provide a
	means of determining how many passes have been carried out. Either
	END or SAVE "filename" can be used for this purpose; the latter will
	automatically save the AOF output on the second pass. For example:

	SAVE	"o.program"


LDR	register, =expression
LDF*	register, =expression
LTORG

	Literals are used to load immediate values that cannot be handled by
	the MOV/MVN and MVF/MNF instructions. The expression is evaluated and
	stored in the nearest following literal pool, and the instruction is
	assembled to load the value from this address. If the expression
	contains an imported symbol, the necessary relocation directive will
	be transparently added.

	A literal pool is automatically added at the end of each area, but
	extra literal pools can be created using the LTORG directive. This is
	particularly useful when using floating point literals as the LDF
	instruction only has a range of +-1020.

	Examples:

	LDR	R0, =&4b534154		; R0 = &4b534154 ("TASK")
	LDFS	F1, =3.1415926536	; F1 = (float) 3.1415926536
	LDR	R7, =external@ + 4	; R7 = address of external + 4


HEAD	"function name"

	This adds a function name header, used by backtraces and
	disassemblers to name functions. For example:

	EXPORT  "compare"
	HEAD	"compare"
	.compare
	; ...
	MOVS	PC, R14


ENTRY

	The ENTRY directive is used to tell the linker where program
	execution should begin.


ORG	address

	ORG sets a base address for the current area. It should be used
	carefully as it may cause problems for the linker, and only really
	makes sense if the code needs to be mapped to a fixed hardware
	location.


ROUT
ROUT	"routine name"

	ROUT marks the beginning of a new local label block. Local labels can
	be defined multiple times in a single source file and are
	particularly useful in macros. For example:

	DEF FNexample
	[	OPT	pass%
		ROUT
		TST	R0, #1
		BNE	@10
		; ...
		B	@20
	.10
		; ...
	.20
	]
	= 0

	A local label always starts with a number and can optionally be
	followed by the routine name, as supplied to the preceding ROUT. It
	is an error to supply differing names; local labels outside the
	current label block are hidden. References to a local label begin
	with an @.


Notes


* Registers are specified in the following form:

	ARM registers:			R0..R15
		using APCS-R names:	A1..A4 V1..V6 SL FP IP SP LR PC
	Floating-point registers:	F0..F7
	General co-processor registers:	C0..C15

  To help to cope with any potential name clashes, the floating point and
  APCS-R register names (except for PC) must be terminated with some
  character not valid in a variable name in order to be recognised; they are
  otherwise treated as part of a variable name.

* Coprocessor numbers (CP#) may be specified using either of the following
  forms:

	P0..P15
	CP0..CP15

* Wherever a register or coprocessor number is specified, an expression may
  be substituted in the usual manner allowed by BASIC V. This module employs
  the routines used within BASIC to evaluate all expressions (eg. register
  numbers, offsets and labels) and hence its interpretation of expressions is
  guaranteed to be the same as BASIC.


Credits


  Adrian Lees (last known, AFAIK, at A.M.Lees-CSEE93@cs.bham.ac.uk):
  - for the original ExtBas and the EQU comma extension, and for the use of
    some of his code

  Michael Rozdoba (mroz@argonet.co.uk; remember TechForum?):
  - for including the "General recursive method for Rb := Ra * C, C a
    constant" from Appendix C of the manual for Acorn's desktop assembler,
    and the late Acorn Computing (Sept 1994) for printing it;
  - for the division code generator (Archimedes World, May 1995), which was
    included, slightly trimmed, and debugged to handle top-bit-set unsigned
    numbers properly... I hope!

  Dominic Symes of !Zap fame (dominic.symes@armltd.co.uk):
  - for pointing out that ANDEQ R0,R0,R0 could usefully be replaced by DCD 0

  Martin Willers (m.willers@tu-bs.de):
  - for bug hunting :-)

  Reuben Thomas (rrt1001@cam.ac.uk):
  - for pointing out it might be useful to disable the APCS register names,
    suggesting B/W/D suffix for FILL (and %) and -ve immediate constants, and
    bug encountering

  Mohsen Alshayef (mohsen@qatar.net.qa):
  - for some useful long MUL, STRH and [CS]PSR info

  Michael Kircher (kircher@ph-cip.uni-koeln.de):
  - for the integer square root code, and a few bug reports
